Date: Wed, 19 Jun 2024 15:08:47 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Christian Brauner <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>, 
	linux-fsdevel <linux-fsdevel@...r.kernel.org>, "the arch/x86 maintainers" <x86@...nel.org>, 
	Linux ARM <linux-arm-kernel@...ts.infradead.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: FYI: path walking optimizations pending for 6.11

On Wed, 19 Jun 2024 at 13:45, Matthew Wilcox <willy@...radead.org> wrote:
>
> Funnily, I'm working on rosebush v2 today.  It's in no shape to send out
> (it's failing ~all of its selftests) but *should* greatly improve the
> cache friendliness of the hash table.  And it's being written with the
> dcache as its first customer.

I'm interested to see if you can come up with something decent, but
I'm not hugely optimistic.

From what I saw, you planned on comparing with rhashtable hash chains of 10.

But that's not what the dentry cache uses at all. rhashtable is way
too slow. It's been ages since I ran the numbers, but the dcache array
is just sized to be "large enough".

In fact, my comment about my workload being better if the hash table
was smaller was because we really are pretty aggressive with the
dcache hash table size. I think our scaling factor is 13 - as in "one
entry per 8kB of memory".

Which is almost certainly wasting memory, but name lookup really does
show up as a hot thing on many loads.

Anyway, what it means is that the dcache hash chain is usually *one*.
Not ten. And has none of the rhashtable overheads.

So if your "use linear lookups to make the lookup faster" depends on
comparing with ten entry chains of rhashtable, you might be in for a
very nasty surprise.

In my profiles, the first load of the hash table tends to be the
expensive one. Not the chain following.

Of course, my profiles are not only from just one random load, they
are also skewed by the fact that I reboot so much. So maybe my dentry
cache just doesn't grow sufficiently big during my testing, and thus
my numbers are skewed even for my own loads.

Benchmarking is hard.

Anyway, that was just a warning that if you're comparing against
rhashtable, you have almost certainly already lost before you even got
started.

             Linus
