[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFzg7Wm6hNyFLrtPB5HOdu0SSkZrbLXFJrgreEN+LYEsdg@mail.gmail.com>
Date: Sat, 17 Mar 2012 09:53:37 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Sven Anderson <sven@...erson.de>
Cc: Chris Mason <chris.mason@...cle.com>,
Al Viro <viro@...iv.linux.org.uk>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: Faulty has_zero()? (was: .. anybody know of any filesystems that
depend on the exact VFS 'namehash' implementation?)
On Sat, Mar 17, 2012 at 5:29 AM, Sven Anderson <sven@...erson.de> wrote:
>
> Am 01.03.2012 um 23:42 schrieb Linus Torvalds:
>
>> +/* Return the high bit set in the first byte that is a zero */
>> +static inline unsigned long has_zero(unsigned long a)
>> +{
>> + return ((a - ONEBYTES) & ~a) & HIGHBITS;
>> +}
>
> (I commented this on your google+ posting as well, but I'm not sure if you will notice it there.)
>
> Out of curiosity I studied your code, and if I'm not mistaken your has_zero() function doesn't do what is expected. If there are leading 0x01 bytes in front of a NUL byte, they are also marked in the mask because of the borrow bit.
So has_zero() doesn't guarantee to return a mask of all zero bytes -
only the *first* one. And that's the only one we care about. Any
subsequent zero bytes are suspect, but we don't use that information -
we mask it all away with the "(mask-1) & ~mask"
So what has_zero() guarantees is two-fold:
- if there are no zeroes at all, it will return 0.
- if there is one or more zero bytes, it will return with the high
bit set in the *first* zero byte (and nothing below that).
But if it returns non-zero, only the first bit is "guaranteed". Any of
the high bits in the higher bytes are indeed suspect, exactly because
of borrow.
But for our use, we simply just don't care - we only ever care about
the *first* NUL byte.
This is why the algorithm is fundamentally little-endian only. You may
have confused your test-case on a big-endian machine. On big-endian,
you would need to do a byte-switching load.
And when we do the "NUL or slash" thing, we will or the two cases
together, which means that we catch the first NUL or slash, and even
though *both* of those are suspect in the higher bits, once more, we
won't care. We will only ever look at the *lowest* of the high bits in
the bytes (that whole , which is exactly why borrow will never matter
for us: it can only affect bits above the lowest one.
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists