On Mon, 2021-05-10 at 12:26 +0200, Mauro Carvalho Chehab wrote:
> There are several UTF-8 characters in the kernel's documentation.
>
> Several of them were due to the process of converting files from DocBook, LaTeX, HTML and Markdown. They were probably introduced by the conversion tools used at that time.
>
> Other UTF-8 characters were added over time, but they're easily replaceable by ASCII characters.
>
> As Linux developers are all around the globe, and not everybody has UTF-8 as their default charset, it is better to use UTF-8 only in cases where it is really needed.
No, that is absolutely the wrong approach.
If someone has a local setup which makes bogus assumptions about text encodings, that is their own mistake.
We don't do them any favours by trying to *hide* their mistake in the common case so that they don't notice it for longer.
There really isn't much excuse for such brokenness, this far into the 21st century.
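Anyone unsure of their own setup can check it in seconds; a sketch, assuming a glibc-style system with a UTF-8 locale available:

    $ locale charmap
    UTF-8
    $ # if it prints anything else, switch to a UTF-8 locale, e.g.:
    $ export LANG=en_US.UTF-8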
Even *before* UTF-8 came along in the final decade of the last millennium, it was important to know which character set a given piece of text was encoded in.
In fact it was even *more* important back then; we couldn't just assume UTF-8 everywhere like we can in modern times.
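And in modern times the conversion itself is a one-liner, e.g. with iconv; a sketch, assuming the legacy input really is Latin-1:

    $ iconv -f ISO-8859-1 -t UTF-8 legacy.txt > legacy.utf8.txt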
Git can already do things like CRLF conversion when checking files out, to match local conventions; if you want to teach it to do character set conversions too, then I suppose that might be useful to the few developers who have fallen through a time warp and still need it. But nobody's ever bothered before, because it just isn't necessary these days.
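If anyone did want to go down that road, the obvious model is the same .gitattributes mechanism that drives the CRLF handling; a hypothetical sketch (the working-tree-encoding attribute in Git 2.18+ already does something similar for UTF-16 files, so the path pattern and encoding below are illustrative only):

    # .gitattributes: hypothetical per-path re-encoding, alongside the
    # line-ending conversion Git performs today
    *.vcproj              text eol=crlf
    Documentation/*.txt   working-tree-encoding=ISO-8859-1

With something like that in place, the repository would keep storing UTF-8 and the re-encoding would happen on checkout, confining the brokenness to the one working tree that asked for it.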
Please *don't* attempt to address this anachronistic and esoteric "requirement" by dragging the kernel source back in time by three decades.