[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAHk-=whAMYsprEG8idgtvmzsx_Si5r3Hzetp39s3Rkkm=6+eSA@mail.gmail.com>
Date: Thu, 19 Jan 2023 10:24:18 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andreas Dilger <adilger@...ger.ca>
Cc: "Theodore Ts'o" <tytso@....edu>,
Eric Biggers <ebiggers@...nel.org>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
Eric Whitney <enwlinux@...il.com>,
"Jason A. Donenfeld" <Jason@...c4.com>,
Masahiro Yamada <masahiroy@...nel.org>
Subject: Re: Detecting default signedness of char in ext4 (despite -funsigned-char)
On Wed, Jan 18, 2023 at 2:20 PM Andreas Dilger <adilger@...ger.ca> wrote:
>
> It makes sense to use the existing UNSIGNED/SIGNED flag in the superblock
> for the dir hash also for the xattr hash.
No. As Eric says, the existing flag for this - while related to a very
similar issue - just doesn't work.
It's for a different hash, and the flag has been set (or not) based on
previous history of that particular filesystem that isn't necessarily
at all relevant. The "signedness" of the xattrs hash simply doesn't
necessarily match the signedness of the existing flag.
Also, honestly, it's just entirely POINTLESS.
Here's the deal: these kinds of flags simply SHOULD NOT EXIST. A
filesystem that has dynamic byte order flags is an INCOMPETENT
filesystem. Yes, I know they exist. The people involved should be
ashamed of them.
For things like byte order flags, it's simply unambiguously better to
just have an unconditional on-disk byte order, even when that byte
order doesn't match the "native" byte order of the machine. Doing an
unconditional 'bswap' operation is literally smaller, simpler and
faster than conditionally doing nothing, and makes for much easier
verification (because you can use static typing rules, like we use for
"__le32" and types like that).
The EXACT same thing is true of architecture-specific signedness.
Yes, you can add flags to dynamically choose one or the other, but
that's actively *worse* then just picking one statically and saying
"this is the correct one".
And honestly, the unsigned version was always the correct one. It's
not even half-way ambiguous like the whole little-endian vs big-endian
discussion used to be. In this very thread Ted made clear that he
hadn't even realized that this signed case could happen and would
matter. The signed case is surprising and unintentional.
So, given that
(a) it's just a *fact* that the static choice is better
(b) of the two choices, one is clearly correct
I'm just going to put my foot down and say that right way forward is
clearly to just make that be the de-facto rule.
And honestly, it's the simpler patch too, since "-funsigned-char" has
already done all the actual "pick the right hash" for us.
We have four levels of "how much do we care" and simplicity:
(1) do nothing, see if anybody actually ever notices outside of generic/454
(2) just remove the hash check entirely, it's not doing anything
semantically meaningful
(3) keep the hash check, replace the "return -EFSCORRUPTED" with a
"pr_warn_once()" and returning zero
(4) the "keep the hash check, if it fails, test it the other way,
always write the correct new hash" patch I already posted
but the thing I refuse to do is to either just maintain the bug as-is
with extra complexity (Eric's patch) or add a new flag (or erroneously
re-use an existing flag) to make it dynamic.
The e2fsck argument is bugus, because *every* case requires e2fsck
changes. Even if we maintain the completely bogus historical behavior
of "the on-disk filesystem representation depends on architecture",
e2fsck had better start warning about it.
And again, I guarantee that the "fix the signedness" is the simpler
thing even for e2fsck, since at least then there is one unambiguously
correct version and one incorrect one, instead of some wishy-washy
"oh, this filesystem is correct for *THESE* architectures".
I'm attaching my suggested patch here again - this time with a commit
message - because while I'm ok with any of options 1-4 above, I
already wrote the patch for the most complicated case, and it's not
*that* complex.
We could possibly add a "pr_warn_once()" so that we have a combination
of (3) and (4).
In the longer term, we should
(a) remove the #ifdef __CHAR_UNSIGNED__ in ext4 as always true
(happily, I grepped - it's the only case in the whole source tree)
(b) probably plan on _eventually_ just remove my disgusting
"calculate signed version" thing, once we think it's all blown over
and we have never seen one of those pr_warn_on() messages outside of
generic/454
but I *really* don't want to add some new filesystem flag to deal with
this situation when a non-flag version is just simpler.
Hmm?
Linus
View attachment "0001-ext4-deal-with-legacy-signed-xattr-name-hash-values.patch" of type "text/x-patch" (3564 bytes)
Powered by blists - more mailing lists