Message-ID: <3d28858f-4ec6-43ea-8a3b-b9ce9a27bac7@linux.ibm.com>
Date: Wed, 4 Jun 2025 21:27:25 +0530
From: Donet Tom <donettom@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc: Mike Rapoport <rppt@...nel.org>, Oscar Salvador <osalvador@...e.de>,
        Zi Yan <ziy@...dia.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Ritesh Harjani <ritesh.list@...il.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, "Rafael J . Wysocki" <rafael@...nel.org>,
        Danilo Krummrich <dakr@...nel.org>,
        Jonathan Cameron <Jonathan.Cameron@...wei.com>,
        Alison Schofield <alison.schofield@...el.com>,
        Yury Norov <yury.norov@...il.com>, Dave Jiang <dave.jiang@...el.com>,
        Madhavan Srinivasan <maddy@...ux.ibm.com>,
        Nilay Shroff
 <nilay@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v7 1/5] drivers/base/node: Optimize memory block
 registration to reduce boot time


On 6/4/25 7:00 PM, David Hildenbrand wrote:
> On 04.06.25 15:17, Donet Tom wrote:
>>
>> On 6/4/25 3:15 PM, David Hildenbrand wrote:
>>> On 04.06.25 05:07, Andrew Morton wrote:
>>>> On Wed, 28 May 2025 12:18:00 -0500 Donet Tom <donettom@...ux.ibm.com>
>>>> wrote:
>>>>
>>>>> During node device initialization, `memory blocks` are registered 
>>>>> under
>>>>> each NUMA node. The `memory blocks` to be registered are identified
>>>>> using
>>>>> the node’s start and end PFNs, which are obtained from the node's
>>>>> pg_data
>>>>
>>>> It's quite unconventional to omit the [0/N] changelog.  This omission
>>>> somewhat messed up my processes so I added a one-liner to this.
>>>>
>>>
>>> Yeah, I was assuming that I simply did not get cc'ed on the cover
>>> letter, but there is actually none.
>>>
>>> Donet please add that in the future. git can do this using
>>> --cover-letter.
>>
>> Sure,
>>
>> I will add cover letter in next revision.
>>
>>
>>>
>>>>>
>>>>> ...
>>>>>
>>>>> Test Results on My system with 32TB RAM
>>>>> =======================================
>>>>> 1. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled.
>>>>>
>>>>> Without this patch
>>>>> ------------------
>>>>> Startup finished in 1min 16.528s (kernel)
>>>>>
>>>>> With this patch
>>>>> ---------------
>>>>> Startup finished in 17.236s (kernel) - 78% Improvement
>>>>
>>>> Well someone is in for a nice surprise.
>>>>
>>>>> 2. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT disabled.
>>>>>
>>>>> Without this patch
>>>>> ------------------
>>>>> Startup finished in 28.320s (kernel)
>>>>
>>>> what.  CONFIG_DEFERRED_STRUCT_PAGE_INIT is supposed to make bootup
>>>> faster.
>>>
>>> Right, that's weird. Especially that it is still slower after these
>>> changes.
>>>
>>> CONFIG_DEFERRED_STRUCT_PAGE_INIT should be initializing in parallel
>>> which ... should be faster.
>>>
>>> @Donet, how many CPUs and nodes does your system have? Can you
>>> identify what is taking longer than without
>>> CONFIG_DEFERRED_STRUCT_PAGE_INIT?
>>
>>
>>
>> My system has,
>>
>> CPU      - 1528
>
> Holy cow.
>
> Pure speculation: are we parallelizing *too much* ? :)
>
> That's ~95 CPUs per node on average.

Yes.

>
> Staring at deferred_init_memmap(), we do have
>
>     max_threads = deferred_page_init_max_threads(cpumask);
>
> And that calls cpumask_weight(), essentially using all CPUs on the node.
>
> ... not sure what exactly happens if there are no CPUs for a node.


Okay.

I'm still debugging what's happening. I'll update you once I find something.
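
For reference, here is a minimal user-space sketch of the thread-count reasoning quoted above. It is not the kernel code. The names model_cpumask_weight(), model_max_threads() and model_avg_cpus_per_node() are hypothetical helpers, and the clamp to one thread for a CPU-less node is an assumption made for illustration, since what the kernel does in that case is exactly the open question above.

```c
/*
 * Hypothetical stand-in for cpumask_weight(): the kernel helper counts
 * the set bits in a struct cpumask. In this model the caller passes the
 * number of CPUs on the node directly.
 */
unsigned int model_cpumask_weight(unsigned int cpus_on_node)
{
	return cpus_on_node;
}

/*
 * Model of "use all CPUs on the node" for deferred page init.
 * ASSUMPTION: a node without CPUs falls back to a single thread.
 * This is not confirmed anywhere in this thread.
 */
unsigned int model_max_threads(unsigned int cpus_on_node)
{
	unsigned int weight = model_cpumask_weight(cpus_on_node);

	return weight > 0 ? weight : 1;
}

/*
 * Average CPUs per node using integer division. Assumes CPUs are spread
 * evenly across nodes, which may not hold on a real machine.
 */
unsigned int model_avg_cpus_per_node(unsigned int cpus, unsigned int nodes)
{
	return nodes ? cpus / nodes : 0;
}
```

With the numbers from this system, model_avg_cpus_per_node(1528, 16) gives 95, so under this model each node would start about 95 init threads, which matches the "~95 CPUs per node" estimate above.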


>
>> Node     - 16
>
> Are any of these memory-less?


No, there are no memory-less nodes. All nodes have around 2 TB of memory.


>
>> Memory - 31TB
>
>
>
