Open Source and information security mailing list archives
 
Date: Fri, 21 Jun 2024 13:53:27 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Christian Brauner <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>, 
	linux-fsdevel <linux-fsdevel@...r.kernel.org>, "the arch/x86 maintainers" <x86@...nel.org>, 
	Linux ARM <linux-arm-kernel@...ts.infradead.org>, 
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, kernel test robot <lkp@...el.com>
Subject: Re: FYI: path walking optimizations pending for 6.11

On Fri, 21 Jun 2024 at 13:04, Matthew Wilcox <willy@...radead.org> wrote:
>
> What I was reacting to in your email was this:
>
> : And on my arm64 machine, it turns out that the best optimization for the
> : load I tested would be to make that hash table smaller to actually be a
> : bit denser in the cache. But that's such a load-dependent optimization
> : that I'm not doing this.
>
> And that's exactly what rosebush does; it starts out incredibly small
> (512 bytes) and then resizes as the buckets overflow.  So if you suspect
> that a denser hashtable would give you better performance, then maybe
> it'll help.
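
[Editorial illustration, not part of the thread: the "start small, resize
as buckets overflow" idea described above can be sketched roughly as
follows. This is a toy model under stated assumptions, not rosebush's
actual implementation. The class name GrowingTable, the 8 buckets x 8
slots layout (picked only so that 8-byte entries would add up to the 512
bytes mentioned), and the doubling policy are all hypothetical.]

```python
class GrowingTable:
    """Toy hash set that starts tiny and doubles when a bucket overflows."""

    def __init__(self, initial_buckets=8, bucket_capacity=8):
        # Assumption: 8 buckets x 8 slots x 8-byte entries = 512 bytes.
        # The layout is illustrative only; Python lists are not 8 bytes per slot.
        self.bucket_capacity = bucket_capacity
        self.buckets = [[] for _ in range(initial_buckets)]

    def _bucket(self, key):
        # Pick the bucket for this key at the current table size.
        return self.buckets[hash(key) % len(self.buckets)]

    def _grow(self):
        # Double the number of buckets and redistribute every key.
        old = self.buckets
        self.buckets = [[] for _ in range(2 * len(old))]
        for bucket in old:
            for key in bucket:
                self._bucket(key).append(key)

    def insert(self, key):
        bucket = self._bucket(key)
        if key in bucket:
            return
        if len(bucket) >= self.bucket_capacity:
            # Overflow: grow once, then re-pick the bucket. A real design
            # would need a policy for keys that keep colliding after growth.
            self._grow()
            bucket = self._bucket(key)
        bucket.append(key)

    def __contains__(self, key):
        return key in self._bucket(key)


table = GrowingTable()
for inode in range(1000):
    table.insert(inode)
print(len(table.buckets), 999 in table)
```

[The table stays dense in cache while the working set is small and only
pays for more buckets once the load actually demands them, which is the
property being discussed here.]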

Well, I was more going "ok, on the exact load _I_ was running, it
would probably help to use a smaller hash table", but I suspect that
in real life our actual hash tables are better.

My benchmark is somewhat real-world in that yes, I benchmark what I
do. But what I do is ridiculously limited. Using git and building
kernels and running a web browser for email does not require 64GB of
RAM.

But that's what I have in what is now my "small" machine, literally
because I wanted to populate every memory channel.

Not because I needed the size, but because I wanted the memory channel
bandwidth.

IOW, my machines tend to be completely over-specced wrt memory. The
kernel build can use about as many cores as you can throw at it, but
even with multiple trees, and everything cached, and hundreds of
parallel compilers going, I just don't use that much RAM. The kernel
build system is pretty damn lean (ask the poor people who do GUI tools
with C++ and the situation changes, but the kernel build is actually
pretty good on resource use).

So the kernel - very reasonably - provisions me with a big hash table,
because I literally have memory to waste.

And it turns out that since _all_ I do on the arm64 box in particular
(it's headless, so not even a web browser) is to build kernels, I
could "tweak" the config for that.

But while it might benchmark better, it would likely not be better in reality.

I'm going to be on the road this weekend, but if you have something
that you think is past the "debug build" stage and is worth
benchmarking, I can try to run it on my machines next week.

              Linus
