Open Source and information security mailing list archives
 
Message-ID:
 <eeb9d580a41cb314aba6ad21e751b506dc9cc434.camel@cyberus-technology.de>
Date: Fri, 21 Feb 2025 14:16:31 +0000
From: Thomas Prescher <thomas.prescher@...erus-technology.de>
To: "willy@...radead.org" <willy@...radead.org>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>, "corbet@....net"
	<corbet@....net>, "muchun.song@...ux.dev" <muchun.song@...ux.dev>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/2] mm: hugetlb: add hugetlb_alloc_threads cmdline option

On Fri, 2025-02-21 at 13:52 +0000, Matthew Wilcox wrote:
> I don't think we should add a command line option (ie blame the
> sysadmin
> for getting it wrong).  Instead, we should figure out the right
> number.
> Is it half the number of threads per socket?  A quarter?  90%?  It's
> bootup, the threads aren't really doing anything else.  But we
> should figure it out, not the sysadmin.

I don't think we will find a number that delivers the best performance
on every system out there. With the two systems we tested, we already
see some differences.

The Skylake servers have 36 threads per socket and deliver the best
performance with 8 threads, which is about 22%. Using more threads
decreases performance.

On Cascade Lake with 48 threads per socket, we see the best performance
with 32 threads, which is about 67%. Using more threads also decreases
performance here (not included in the table above). The benefit of
using more than 8 threads is marginal, though.

I'm completely open to changing the default to something that makes
more sense. From the experiments we did so far, 25% of the threads per
node delivers reasonably good performance. We could still keep the
parameter for sysadmins who want to micro-optimize boot time, though.
