Message-ID: <ahdxc464lydwmyqugl472r3orhrj5dasevw5f6edsdhj3dm6zc@lolmht6hpi6t>
Date: Sun, 27 Apr 2025 20:55:30 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Matthew Wilcox <willy@...radead.org>, Theodore Ts'o <tytso@....edu>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, 
	linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] bcachefs fixes for 6.15-rc4

On Fri, Apr 25, 2025 at 08:40:35PM +0100, Matthew Wilcox wrote:
> On Fri, Apr 25, 2025 at 09:35:27AM -0700, Linus Torvalds wrote:
> > Now, if filesystem people were to see the light, and have a proper and
> > well-designed case insensitivity, that might change. But I've never
> > seen even a *whiff* of that. I have only seen bad code that
> > understands neither how UTF-8 works, nor how unicode works (or rather:
> > how unicode does *not* work - code that uses the unicode comparison
> > functions without a deeper understanding of what the implications
> > are).
> > 
> > Your comments blaming unicode is only another sign of that.
> > 
> > Because no, the problem with bad case folding isn't in unicode.
> > 
> > It's in filesystem people who didn't understand - and still don't,
> > after decades - that you MUST NOT just blindly follow some external
> > case folding table that you don't understand and that can change over
> > time.
> 
> I think this is something that NTFS actually got right.  Each filesystem
> carries with it a 128KiB table that maps each codepoint to its
> case-insensitive equivalent.  So there's no ambiguity about "which
> version of the unicode standard are we using", "Does the user care
> about Turkish language rules?", "Is Aachen a German or Danish word?".
> The sysadmin specified all that when they created the filesystem, and it
> doesn't matter what the Unicode standard changes in the future; if you
> need to change how the filesystem sorts things, you can update the table.
> 
> It's not the perfect solution, but it might be the least-bad one I've
> seen.

The thing is, that's exactly what we're doing. ext4 and bcachefs both
refer to a specific revision of the folding rules: for ext4 it's
specified in the superblock, for bcachefs it's hardcoded for the moment.

I don't think this is the ideal approach, though.

That means the folding rules are "whatever you got when you mkfs'd".
Think about what that means if you've got a fleet of machines, of
different ages, but all updated in sync: that's a really annoying way
for gremlins of the "why does this machine act differently" variety to
creep in.
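
A toy sketch of that scenario, in Python rather than kernel C. The two
folding tables and the names FOLD_V1, FOLD_V2, fold() and same_file() are
made up for illustration; they are not real Unicode data and not any
filesystem's actual code.

```python
# Toy model: each "filesystem" pins the folding table it was created with.
# FOLD_V1 / FOLD_V2 are hypothetical tables, not real Unicode revisions.
FOLD_V1 = {"A": "a", "B": "b"}
FOLD_V2 = {**FOLD_V1, "\u00c4": "\u00e4"}  # a later revision adds one mapping


def fold(name, table):
    """Fold a name one code point at a time using the given table."""
    return "".join(table.get(ch, ch) for ch in name)


def same_file(a, b, table):
    """True if two names collide under the given folding table."""
    return fold(a, table) == fold(b, table)


# Machine 1 was mkfs'd with the old table, machine 2 with the new one.
# Both run the same kernel today, yet they disagree about these two names.
old_fs, new_fs = FOLD_V1, FOLD_V2
print(same_file("\u00c4B", "\u00e4b", old_fs))  # old filesystem: distinct names
print(same_file("\u00c4B", "\u00e4b", new_fs))  # new filesystem: same file
```

The only input that differs between the two calls is the table pinned at
creation time, which is the "why does this machine act differently" case.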

What I'd prefer is for the unicode folding rules to be transparently and
automatically updated when the kernel is updated, so that behaviour
stays in sync. That would behave more the way users would expect.

But I only gave this real thought just over the past few days, and doing
this safely and correctly would require some fairly significant changes
to the way casefolding works.

We'd have to ensure that lookups via the case sensitive name always
work, even if the casefolding table the dirent was created with gives
different results than the currently active casefolding table.

That would require storing two different "dirents" for each real dirent,
one normalized and one un-normalized, because we'd have to do an
un-normalized lookup if the normalized lookup fails (and vice versa).
Which should be completely fine from a performance POV, assuming we have
working negative dentries.
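
A minimal sketch of the two-index idea with the fallback lookup, again as
a Python toy and under stated assumptions: the Directory class, its
methods, and the FOLD_OLD / FOLD_NEW tables are hypothetical, and a real
implementation would live in the filesystem's dirent code, not in a dict.

```python
# Hypothetical folding tables: FOLD_OLD was active when the dirent was
# created, FOLD_NEW is what the running kernel ships.
FOLD_OLD = {"A": "a"}
FOLD_NEW = {"A": "a", "\u00c4": "\u00e4"}


def fold(name, table):
    """Fold a name one code point at a time using the given table."""
    return "".join(table.get(ch, ch) for ch in name)


class Directory:
    """Toy directory keeping two indexes per entry: exact and folded name."""

    def __init__(self):
        self.exact = {}   # un-normalized name -> inode number
        self.folded = {}  # name folded at creation time -> inode number

    def create(self, name, inode, table):
        self.exact[name] = inode
        self.folded[fold(name, table)] = inode

    def lookup(self, name, table):
        # Try the normalized index first, folding with the active table.
        inode = self.folded.get(fold(name, table))
        if inode is None:
            # Fall back to the un-normalized index so that the exact,
            # case sensitive name resolves regardless of table changes.
            inode = self.exact.get(name)
        return inode


d = Directory()
d.create("\u00c4A", 42, FOLD_OLD)       # stored folded as "\u00c4a"
print(d.lookup("\u00c4A", FOLD_NEW))    # normalized miss, exact hit
print(d.lookup("\u00e4a", FOLD_NEW))    # matches neither index
```

Note what the sketch does and does not guarantee: the exact name always
resolves, but a case-insensitive variant that only matches under the new
table still misses, because the stored normalized key was computed with
the old one.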

But, if the unicode folding rules are stable enough (and one would hope
they are), hopefully all this is a non-issue.

I'd have to gather more input from users of casefolding on other
filesystems before saying what our long term plans (if any) will be.
