Message-ID: <8abecd5b-2768-49d0-afc3-561b95d77a24@redhat.com>
Date: Wed, 4 Jun 2025 15:30:05 +0200
From: David Hildenbrand <david@...hat.com>
To: Donet Tom <donettom@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Mike Rapoport <rppt@...nel.org>, Oscar Salvador <osalvador@...e.de>,
Zi Yan <ziy@...dia.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Ritesh Harjani <ritesh.list@...il.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, "Rafael J . Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Alison Schofield <alison.schofield@...el.com>,
Yury Norov <yury.norov@...il.com>, Dave Jiang <dave.jiang@...el.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, Nilay Shroff
<nilay@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v7 1/5] drivers/base/node: Optimize memory block
registration to reduce boot time
On 04.06.25 15:17, Donet Tom wrote:
>
> On 6/4/25 3:15 PM, David Hildenbrand wrote:
>> On 04.06.25 05:07, Andrew Morton wrote:
>>> On Wed, 28 May 2025 12:18:00 -0500 Donet Tom <donettom@...ux.ibm.com>
>>> wrote:
>>>
>>>> During node device initialization, `memory blocks` are registered under
>>>> each NUMA node. The `memory blocks` to be registered are identified
>>>> using
>>>> the node’s start and end PFNs, which are obtained from the node's
>>>> pg_data
>>>
>>> It's quite unconventional to omit the [0/N] changelog. This omission
>>> somewhat messed up my processes so I added a one-liner to this.
>>>
>>
>> Yeah, I was assuming that I simply did not get cc'ed on the cover
>> letter, but there is actually none.
>>
>> Donet please add that in the future. git can do this using
>> --cover-letter.
>
> Sure,
>
> I will add cover letter in next revision.
>
>
>>
>>>>
>>>> ...
>>>>
>>>> Test Results on My system with 32TB RAM
>>>> =======================================
>>>> 1. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled.
>>>>
>>>> Without this patch
>>>> ------------------
>>>> Startup finished in 1min 16.528s (kernel)
>>>>
>>>> With this patch
>>>> ---------------
>>>> Startup finished in 17.236s (kernel) - 78% Improvement
>>>
>>> Well someone is in for a nice surprise.
>>>
>>>> 2. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT disabled.
>>>>
>>>> Without this patch
>>>> ------------------
>>>> Startup finished in 28.320s (kernel)
>>>
>>> what. CONFIG_DEFERRED_STRUCT_PAGE_INIT is supposed to make bootup
>>> faster.
>>
>> Right, that's weird. Especially that it is still slower after these
>> changes.
>>
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT should be initializing in parallel
>> which ... should be faster.
>>
>> @Donet, how many CPUs and nodes does your system have? Can you
>> identify what is taking longer than without
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT?
>
>
>
> My system has,
>
> CPU - 1528

Holy cow.

Pure speculation: are we parallelizing *too much*? :)

That's ~95 CPUs per node on average.

Staring at deferred_init_memmap(), we do have

	max_threads = deferred_page_init_max_threads(cpumask);

And that calls cpumask_weight(), essentially using all CPUs on the node.

... not sure what exactly happens if there are no CPUs for a node.

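To make the concern a bit more concrete, here is a rough userspace sketch (not the kernel code) of how a per-node thread count derived from cpumask_weight() would behave. The helper names sketch_cpumask_weight(), sketch_max_threads() and sketch_avg_cpus_per_node(), as well as the 64-bit mask, are made up for illustration. The clamp to at least one thread is an assumption about how a CPU-less node might be handled, not something confirmed in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for cpumask_weight(): count set bits in a 64-bit mask. */
static int sketch_cpumask_weight(uint64_t node_cpumask)
{
	int weight = 0;

	while (node_cpumask) {
		node_cpumask &= node_cpumask - 1;	/* clear the lowest set bit */
		weight++;
	}
	return weight;
}

/*
 * Hypothetical model of the per-node thread count: one thread per CPU on
 * the node. ASSUMPTION: a node without any CPUs is clamped to one thread.
 */
static int sketch_max_threads(uint64_t node_cpumask)
{
	int weight = sketch_cpumask_weight(node_cpumask);

	return weight > 0 ? weight : 1;
}

/* Average CPUs per node, using integer division. */
static int sketch_avg_cpus_per_node(int nr_cpus, int nr_nodes)
{
	return nr_cpus / nr_nodes;
}
```

With the numbers reported above, sketch_avg_cpus_per_node(1528, 16) truncates 95.5 down to 95, and under the stated assumption a node with an empty mask would still get a single thread.
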
> Node - 16

Are any of these memory-less?

> Memory - 31TB
--
Cheers,
David / dhildenb