[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <78bc6a1b-164e-4925-a624-a271a4499364@redhat.com>
Date: Thu, 8 May 2025 11:18:35 +0200
From: David Hildenbrand <david@...hat.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Oscar Salvador <osalvador@...e.de>, Donet Tom <donettom@...ux.ibm.com>,
Zi Yan <ziy@...dia.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andrew Morton <akpm@...ux-foundation.org>, rafael@...nel.org,
Danilo Krummrich <dakr@...nel.org>, Ritesh Harjani <ritesh.list@...il.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Alison Schofield <alison.schofield@...el.com>,
Yury Norov <yury.norov@...il.com>, Dave Jiang <dave.jiang@...el.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/3] driver/base: Optimize memory block registration to
reduce boot time
On 05.05.25 15:24, Mike Rapoport wrote:
> On Mon, May 05, 2025 at 10:18:43AM +0200, David Hildenbrand wrote:
>> On 05.05.25 09:53, Mike Rapoport wrote:
>>> On Mon, May 05, 2025 at 09:38:43AM +0200, David Hildenbrand wrote:
>>>> On 05.05.25 09:28, Oscar Salvador wrote:
>>>>> On Mon, May 05, 2025 at 09:16:48AM +0200, David Hildenbrand wrote:
>>>>>> memory hotplug code never calls register_one_node(), unless I am missing
>>>>>> something.
>>>>>>
>>>>>> During add_memory_resource(), we call __try_online_node(nid, false), meaning
>>>>>> we skip register_one_node().
>>>>>>
>>>>>> The only caller of __try_online_node(nid, true) is try_online_node(), called
>>>>>> from CPU hotplug code, and I *guess* that is not required.
>>>>>
>>>>> Well, I guess this is because we need to link the cpus to the node.
>>>>> register_one_node() has two jobs: 1) register cpus belonging to the node
>>>>> and 2) register memory-blocks belonging to the node (if any).
>>>>
>>>> Ah, via __register_one_node() ...
>>>>
>>>> I would assume that an offline node
>>>>
>>>> (1) has no memory
>>>> (2) has no CPUs
>>>>
>>>> When we *hotplug* either memory or CPUs, and we first online the node, there
>>>> is nothing to register. Because if there would be something, the node would
>>>> already be online.
>>>>
>>>> In particular, try_offline_node() will only offline a node if
>>>>
>>>> (A) No present pages: No pages are spanned anymore. This includes
>>>> offline memory blocks.
>>>> (B) No present CPUs.
>>>>
>>>> But maybe there is some case that I am missing ...
>>>
>>> I actually hoped you and Oscar know how that stuff works :)
>>
>> Well, I know how the memory side works, but the CPU side is giving me a hard
>> time :)
>>
>>>
>>> I tried to figure what is going on there and it all looks really convoluted.
>>
>> Jap ...
>>
>>>
>>> So, on boot we have
>>> cpu_up() ->
>>> try_online_node() ->
>>> bails out because all nodes are online (at least on
>>> x86 AFAIU, see 1ca75fa7f19d ("arch/x86/mm/numa: Do
>>> not initialize nodes twice"))
>>> node_dev_init()i ->
>>> register_one_node() ->
>>> this one can use __register_one_node() and loop
>>> over memblock regions.
>>>
>>> And for the hotplug/unplug path, it seems that
>>> register_memory_blocks_under_node(MEMINIT_EARLY) is superfluous, because if
>>> a node had memory it wouldn't get offlined, and if we are hotplugging an
>>> node with memory and cpus, memory hotplug anyway calls
>>> register_memory_blocks_under_node_hotplug().
>>>
>>> So, IMHO, register_one_node() should not call
>>> register_memory_blocks_under_node() at all, but again, I might have missed
>>> something :)
>>
>> Hm, but someone has to create these links for the memory blocks.
>
> My understanding that the links for the memory blocks during hotplug are created in
>
> add_memory_resource()
> register_memory_blocks_under_node()
Yes, during hotplug it's exactly that, after registering the node +
setting it online.
>
> So register_one_node() only calls register_memory_blocks_under_node() when
> there are no actual memory resources under that node, isn't it?
Except in early boot. That's why register_one_node() has:
register_memory_blocks_under_node(nid, start_pfn, end_pfn,
MEMINIT_EARLY);
^ early :)
And that is triggered by
node_dev_init()->register_one_node()
>
> Then we can drop the call to register_memory_blocks_under_node() from
> register_one_node() and add creation of memory blocks to node_dev_init(),
> i.e.
>
> node_dev_init()
> for_each_node(nid)
> __register_one_node(nid)
> for_each_mem_region()
> /* create memory block if node matches */
Yes exactly, that makes sense.
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists