Message-ID: <20250428022240.GC6134@sol.localdomain>
Date: Sun, 27 Apr 2025 19:22:40 -0700
From: Eric Biggers <ebiggers@...nel.org>
To: Autumn Ashton <misyl@...ggi.es>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>,
	Matthew Wilcox <willy@...radead.org>, Theodore Ts'o <tytso@....edu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] bcachefs fixes for 6.15-rc4

On Mon, Apr 28, 2025 at 03:05:19AM +0100, Autumn Ashton wrote:
> 
> 
> On 4/28/25 2:43 AM, Kent Overstreet wrote:
> > On Sun, Apr 27, 2025 at 06:30:59PM -0700, Eric Biggers wrote:
> > > On Sun, Apr 27, 2025 at 08:55:30PM -0400, Kent Overstreet wrote:
> > > > The thing is, that's exactly what we're doing. ext4 and bcachefs both
> > > > refer to a specific revision of the folding rules: for ext4 it's
> > > > specified in the superblock, for bcachefs it's hardcoded for the moment.
> > > > 
> > > > I don't think this is the ideal approach, though.
> > > > 
> > > > That means the folding rules are "whatever you got when you mkfs'd".
> > > > Think about what that means if you've got a fleet of machines, of
> > > > different ages, but all updated in sync: that's a really annoying way
> > > > for gremlins of the "why does this machine act differently" variety to
> > > > creep in.
> > > > 
> > > > What I'd prefer is for the unicode folding rules to be transparently and
> > > > automatically updated when the kernel is updated, so that behaviour
> > > > stays in sync. That would behave more the way users would expect.
> > > > 
> > > > But I only gave this real thought just over the past few days, and doing
> > > > this safely and correctly would require some fairly significant changes
> > > > to the way casefolding works.
> > > > 
> > > > We'd have to ensure that lookups via the case sensitive name always
> > > > work, even if the casefolding table the dirent was created with gives
> > > > different results than the currently active casefolding table.
> > > > 
> > > > That would require storing two different "dirents" for each real dirent,
> > > > one normalized and one un-normalized, because we'd have to do an
> > > > un-normalized lookup if the normalized lookup fails (and vice versa).
> > > > Which should be completely fine from a performance POV, assuming we have
> > > > working negative dentries.
> > > > 
> > > > But, if the unicode folding rules are stable enough (and one would hope
> > > > they are), hopefully all this is a non-issue.
> > > > 
> > > > I'd have to gather more input from users of casefolding on other
> > > > filesystems before saying what our long term plans (if any) will be.
> > > 
> > > Wouldn't lookups via the case-sensitive name keep working even if the
> > > case-insensitivity rules change?  It's lookups via a case-insensitive name that
> > > could start producing different results.  Applications can depend on
> > > case-insensitive lookups being done in a certain way, so changing the
> > > case-insensitivity rules can be risky.
> > 
> > No, because right now on a case-insensitive filesystem we _only_ do the
> > lookup with the normalized name.
> > 
> > > Regardless, the long-term plan for the case-insensitivity rules should be to
> > > deprecate the current set of rules, which does Unicode normalization which is
> > > way overkill.  It should be replaced with a simple version of case-insensitivity
> > > that matches what FAT does.  And *possibly* also a version that matches what
> > > NTFS does (a u16 upcase_table[65536] indexed by UTF-16 code units), if someone
> > > really needs that.
> > > 
> > > As far as I know, that was all that was really needed in the first place.
> > > 
> > > People misunderstood the problem as being about language support, rather than
> > > about compatibility with legacy filesystems.  And as a result they incorrectly
> > > decided they should do Unicode normalization, which is way too complex and has
> > > all sorts of weird properties.
> > 
> > Believe me, I do see the appeal of that.
> > 
> > One of the things I should really float with e.g. Valve is the
> > possibility of providing tooling/auditing to make it easy to fix
> > userspace code that's doing lookups that only work with casefolding.
> 
> This is not really about fixing userspace code that expects casefolding, or
> providing some form of stopgap there.
> 
> The main need there is Proton/Wine, which is a compat layer for Windows
> apps, which needs to pretend it's on NTFS and everything there expects
> casefolding to work.
> 
> No auditing/tooling required, we know the problem. It is unavoidable.
> 
> I agree with the point about Unicode normalization being odd, though: when
> I was implementing casefolding for bcachefs, I immediately thought it was a
> huge hammer to do full normalization for the intended purpose, and not just
> a big table...
> 
> FWIR, there are actually two forms of casefolding in Unicode: full
> casefolding (C+F), e.g. ß->ss, and simple casefolding (C+S), where lengths
> don't change and it's code point for code point.

Yet, ext4 and f2fs's (and now bcachefs's...) "casefolding" is *not* compatible
with NTFS.

Nor is it compatible with FAT (which is what Android needed).

Nor does it actually do Unicode casefolding
(https://www.unicode.org/Public/16.0.0/ucd/CaseFolding.txt), but rather
Unicode normalization which is more complex.
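
To make the difference between the two operations concrete, here is a minimal
userspace sketch in Python (not kernel code, and not what any of these
filesystems literally run). It relies only on the standard library's
str.casefold(), which applies full case folding, and on
unicodedata.normalize(). The helper names are hypothetical, and NFD is picked
purely as an example normalization form:

```python
import unicodedata

def casefold_only(name: str) -> str:
    # Full Unicode case folding, no normalization (hypothetical helper).
    return name.casefold()

def normalize_only(name: str) -> str:
    # Canonical decomposition (NFD), no case folding (hypothetical helper).
    return unicodedata.normalize("NFD", name)

composed = "\u00e9"      # 'é' as a single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

# Case folding equates names that differ only in case...
print(casefold_only("STRASSE") == casefold_only("straße"))    # True
# ...but does not equate different encodings of the same character.
print(casefold_only(composed) == casefold_only(decomposed))   # False

# Normalization equates the two encodings of 'é'...
print(normalize_only(composed) == normalize_only(decomposed)) # True
# ...but on its own does nothing about case.
print(normalize_only("STRASSE") == normalize_only("straße"))  # False
```

The two operations answer different questions, so a scheme built around
normalization ends up with matching rules that neither plain case folding nor
a legacy upcase table would produce.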

I suspect that all that was really needed was case-insensitivity of ASCII a-z.
All of these versions of case-insensitivity provide that, so that is why they
may "seem" compatible...
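
For illustration, ASCII-only case-insensitivity could look something like the
following Python sketch. This is an assumption about what "case-insensitivity
of ASCII a-z" would mean in practice, not an existing kernel interface; the
function names are hypothetical:

```python
def ascii_fold(name: bytes) -> bytes:
    # Lowercase only the bytes 'A'..'Z'; every other byte, including all
    # bytes of multi-byte UTF-8 sequences, is passed through unchanged.
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in name)

def ascii_names_match(a: bytes, b: bytes) -> bool:
    # Two names match if they are identical after ASCII-only folding.
    return ascii_fold(a) == ascii_fold(b)

print(ascii_names_match(b"README.TXT", b"readme.txt"))              # True
# Non-ASCII letters are compared byte for byte, so 'É' and 'é' differ.
print(ascii_names_match("É".encode("utf-8"), "é".encode("utf-8")))  # False
```

Because UTF-8 continuation and lead bytes are all >= 0x80, the fold never
touches them, so the result stays valid UTF-8 and the same length as the
input.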

- Eric
