Message-ID: <AANLkTinZ=Bk53KCr4_8Vjpb6M+RWq6n2XCz=rY2DOLRx@mail.gmail.com>
Date:	Thu, 16 Dec 2010 08:51:04 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Boaz Harrosh <bharrosh@...asas.com>
Cc:	David Miller <davem@...emloft.net>, npiggin@...il.com,
	hooanon05@...oo.co.jp, npiggin@...nel.dk,
	linux-arch@...r.kernel.org, x86@...nel.org,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: Big git diff speedup by avoiding x86 "fast string" memcmp

On Thu, Dec 16, 2010 at 1:53 AM, Boaz Harrosh <bharrosh@...asas.com> wrote:
>
> You misunderstood me. I'm saying that we know the beginning of the
> string is aligned and Nick offered to pad the last long, so surely
> a shift by 2 (or 3) + the reduction of the 12 dec-and-test to 3
> should give you an optimization?

Sadly, right now we don't know that the string is necessarily even aligned.

Yes, it's always aligned in a dentry, because it's either the inline
short string, or it's the longer string we explicitly allocated to the
dentry.

But when we do name compares in __d_lookup, only one part of that is a
dentry. The other is a qstr, and the name there is not aligned. In
fact, it's not even NUL-terminated. It's the data directly from the
path itself.
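
To make the qstr side concrete, here is a minimal userspace sketch of a
name reference that points straight into a path string. The names
`name_ref` and `next_component` are hypothetical stand-ins for
illustration, not the kernel's struct qstr or its path walking code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a qstr-like name reference. */
struct name_ref {
    const unsigned char *name;  /* points into the path string itself */
    size_t len;                 /* explicit length: no NUL follows the name */
};

/* Return the next path component and advance *pathp past it. */
static struct name_ref next_component(const char **pathp)
{
    const char *p = *pathp;
    struct name_ref ref;

    /* Skip any leading slashes. */
    while (*p == '/')
        p++;
    ref.name = (const unsigned char *)p;

    /* The component ends at the next slash or at the end of the path. */
    while (*p != '\0' && *p != '/')
        p++;
    ref.len = (size_t)(p - (const char *)ref.name);

    *pathp = p;
    return ref;
}
```

For a path like "/usr/lib/libc.so" the first component starts at offset 1
of the string, so its alignment is whatever the path happens to give it,
and the byte after it is '/' rather than NUL.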

So we can certainly do compares a "long" at a time, but it's not
entirely trivial. And just making the dentries be aligned and
null-padded is not enough. Most likely, you'd have to make the dentry
name compare function do an unaligned load from the qstr part, and
then do the masking.
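
As a rough illustration of that idea, here is a userspace sketch, not
kernel code. The function name `name_equal_wordwise` is hypothetical. It
assumes the dentry-side name is long-aligned and zero-padded to a
multiple of sizeof(unsigned long), that the other name is an unaligned,
unterminated byte range, and that the caller has already checked that
the lengths match. For the tail it copies only the remaining bytes into
a zeroed word, which gives the same result as a full load followed by a
mask but never reads past the end of the name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * dname: aligned, zero-padded up to a multiple of sizeof(unsigned long).
 * qname: len bytes at any alignment, with no terminating NUL.
 */
static bool name_equal_wordwise(const unsigned long *dname,
                                const unsigned char *qname, size_t len)
{
    const size_t w = sizeof(unsigned long);

    /* Full words: memcpy is the portable way to spell an unaligned load. */
    while (len >= w) {
        unsigned long q;

        memcpy(&q, qname, w);
        if (q != *dname)
            return false;
        dname++;
        qname += w;
        len -= w;
    }

    if (len) {
        /*
         * Tail: fill only the low-address bytes of a zeroed word, so the
         * rest matches the zero padding on the dentry side.
         */
        unsigned long q = 0;

        memcpy(&q, qname, len);
        return q == *dname;
    }
    return true;
}
```

A real implementation would also have to decide whether a full-word load
at the tail is safe (it may touch bytes past the name), which this
sketch sidesteps by copying only the bytes that are known to exist.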

That is likely still the fastest approach on something like x86, where
unaligned loads are cheap, but on other architectures it might be less
so.

                                     Linus