Message-ID:
<eeb9d580a41cb314aba6ad21e751b506dc9cc434.camel@cyberus-technology.de>
Date: Fri, 21 Feb 2025 14:16:31 +0000
From: Thomas Prescher <thomas.prescher@...erus-technology.de>
To: "willy@...radead.org" <willy@...radead.org>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>, "corbet@....net"
<corbet@....net>, "muchun.song@...ux.dev" <muchun.song@...ux.dev>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/2] mm: hugetlb: add hugetlb_alloc_threads cmdline option
On Fri, 2025-02-21 at 13:52 +0000, Matthew Wilcox wrote:
> I don't think we should add a command line option (ie blame the
> sysadmin
> for getting it wrong). Instead, we should figure out the right
> number.
> Is it half the number of threads per socket? A quarter? 90%? It's
> bootup, the threads aren't really doing anything else. But we
> should figure it out, not the sysadmin.
I don't think we will find a number that delivers the best performance
on every system out there. With the two systems we tested, we already
see some differences.
The Skylake servers have 36 threads per socket and deliver the best
performance when we use 8 threads, which is 22%. Using more threads
decreases performance.
On Cascade Lake with 48 threads per socket, we see the best performance
when using 32 threads, which is 66%. Using more threads also decreases
performance here (not included in the table above). The performance
benefit of using more than 8 threads is very marginal though.
I'm completely open to changing the default to something that makes
more sense. From the experiments we did so far, 25% of the threads per
node delivers reasonably good performance. We could still keep the
parameter for sysadmins who want to micro-optimize the bootup time,
though.