Message-ID: <d87f7b76-8a53-4023-81e2-5d257c90acc2@zytor.com>
Date: Wed, 30 Apr 2025 19:48:20 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: "Theodore Ts'o" <tytso@....edu>,
Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>,
linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] bcachefs fixes for 6.15-rc4

On 4/25/25 12:59, Theodore Ts'o wrote:
>
> Another use case was Valve who wanted to support Windows games that
> expected case folding to work. (Microsoft Windows; the gift that keeps
> on giving...) In fact the engineer who worked on case folding was
> paid by Valve to do the work.
>
> That being said, I completely agree with Linus that case insensitivity
> is a nightmare, and I don't really care about performance. The use
> cases where people care about this don't have directories with a large
> number of entries, and we **really** don't want to encourage more use
> of case insensitive lookups. There's a reason why spent much effort
> improving the CLI tools' support for case folding. It's good enough
> that it works for Android and Valve, and that's fine.
>
[...]
>
> Perhaps if we were going to do it all over, we might have only
> supported ASCII, or ISO Latin-1, and not used Unicode at all. But
> then I'm sure Valve or Android mobile handset manufacturers would be
> unhappy that this might not be good enough for some country that they
> want to sell into, like, say, Japan or more generally, any country
> beyond US and Europe.
>
> What we probably could do is to create our own table that didn't
> support all Unicode scripts, but only the ones which are required by
> Valve and Android. But that would require someone willing to do this
> work on a volunteer basis, or convince some company to pay to do this
> work. We could probably reduce the kernel size by doing this, and it
> would probably make the code more maintainable. I'm just not sure
> anyone thinks it's worthwhile to invest more into it. In fact, I'm a
> bit surprised Kent decided he wanted to add this feature into bcachefs.
>
> Sometimes, partitioning a feature which is only needed for backwards
> compatibility is in fact the right approach. And throwing good
> money after bad is rarely worth it.
>
[Yes, I realize I'm really late to weigh in on this discussion]

It is worth noting that Microsoft has basically declared their
"recommended" case folding (upcase) table to be permanently frozen (for
new filesystem instances in the case where they use an on-disk
translation table created at format time.) As far as I know they have
never supported anything other than 1:1 conversion of BMP code points,
nor normalization.
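
To make "1:1 conversion of BMP code points, no normalization" concrete, here is a minimal Python sketch of a name comparison done that way. The table below is a toy I made up for illustration (ASCII plus part of Latin-1 only), not Microsoft's table, and the function names are hypothetical:

```python
def build_toy_upcase_table():
    """Toy 1:1 upcase table indexed by UTF-16 code unit (BMP only).

    Assumption: only ASCII and part of Latin-1 are filled in. This is
    NOT the recommended Microsoft table, just enough to show the idea.
    """
    table = list(range(0x10000))          # identity mapping by default
    for cp in range(ord("a"), ord("z") + 1):
        table[cp] = cp - 0x20             # a..z -> A..Z
    for cp in range(0xE0, 0xFF):          # U+00E0..U+00FE
        if cp != 0xF7:                    # U+00F7 is the division sign
            table[cp] = cp - 0x20
    return table


def utf16_units(name):
    """Split a string into 16-bit UTF-16 code units."""
    data = name.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def names_equal(a, b, table):
    """Compare two names unit by unit through the table.

    Each code unit maps to exactly one code unit, so names of different
    lengths can never match, and nothing is normalized first.
    """
    ua, ub = utf16_units(a), utf16_units(b)
    return len(ua) == len(ub) and all(
        table[x] == table[y] for x, y in zip(ua, ub))
```

With this scheme "README.txt" and "readme.TXT" compare equal, but "straße" never matches "STRASSE" (that would need a 1:2 mapping), and a precomposed "é" never matches "e" followed by a combining acute accent (that would need normalization). Surrogate code units simply map to themselves, so nothing outside the BMP is ever folded.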
The exFAT specification enumerates the full recommended upcase table,
although in a somewhat annoying format (basically a hex dump of
compressed data):
https://learn.microsoft.com/en-us/windows/win32/fileio/exfat-specification
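
For what it's worth, expanding that compressed form is not much code. The sketch below assumes the identity-run scheme as I read it in the specification: a 16-bit value of FFFFh is followed by a count of code points that map to themselves, and every other value is the upcase mapping for the next code point. Please check that reading against the spec before relying on it; the function name is hypothetical.

```python
def expand_upcase_table(compressed):
    """Expand a list of 16-bit values into a full 65536-entry table.

    Assumed encoding (verify against the exFAT specification):
      0xFFFF, n  -> the next n code points map to themselves
      any other  -> upcase mapping for the current code point
    """
    table = list(range(0x10000))   # anything not covered stays identity
    index = 0                      # code point currently being described
    i = 0
    while i < len(compressed) and index < 0x10000:
        value = compressed[i]
        if value == 0xFFFF and i + 1 < len(compressed):
            index += compressed[i + 1]   # skip a run of identity mappings
            i += 2
        else:
            table[index] = value         # explicit mapping
            index += 1
            i += 1
    return table
```

For example, the made-up input [0xFFFF, 0x0061] followed by the 26 values 0x0041..0x005A says "the first 0x61 code points map to themselves, then a..z map to A..Z". That input is my own illustration, not an excerpt from the real table.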
This is basically an admission that the problems involved with case
folding are unsolvable, and just puts a tourniquet on the wound.

It also means that "legacy OS compatibility" is really a totally
different problem than "proper Unicode normalization" and that the
former is far more limited in scope.

-hpa