[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <736ca451-8adc-4c5c-b721-6b78eaeb4699@linux.ibm.com>
Date: Fri, 11 Apr 2025 17:06:55 +0530
From: Donet Tom <donettom@...ux.ibm.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Ritesh Harjani <ritesh.list@...il.com>, rafael@...nel.org,
Danilo Krummrich <dakr@...nel.org>
Subject: Re: [PATCH 2/2] base/node: Use
curr_node_memblock_intersect_memory_block to Get Memory Block NID if
CONFIG_DEFERRED_STRUCT_PAGE_INIT is Set
On 4/11/25 4:29 PM, Mike Rapoport wrote:
> On Fri, Apr 11, 2025 at 12:27:28AM +0530, Donet Tom wrote:
>> On 4/10/25 1:37 PM, Mike Rapoport wrote:
>>> On Wed, Apr 09, 2025 at 10:57:57AM +0530, Donet Tom wrote:
>>>> In the current implementation, when CONFIG_DEFERRED_STRUCT_PAGE_INIT is
>>>> set, we iterate over all PFNs in the memory block and use
>>>> early_pfn_to_nid to find the NID until a match is found.
>>>>
>>>> This patch we are using curr_node_memblock_intersect_memory_block() to
>>>> check if the current node's memblock intersects with the memory block
>>>> passed when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. If an intersection
>>>> is found, the memory block is added to the current node.
>>>>
>>>> If CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set, the existing mechanism
>>>> for finding the NID will continue to be used.
>>> I don't think we really need different mechanisms for different settings of
>>> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>>>
>>> node_dev_init() runs after all struct pages are already initialized and can
>>> always use pfn_to_nid().
>>
>> In the current implementation, if CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, we perform a binary search in the memblock region to
>> determine the pfn's nid. Otherwise, we use pfn_to_nid() to obtain
>> the pfn's nid.
>>
>> Your point is that we could unify this logic and always use
>> pfn_to_nid() to determine the pfn's nid, regardless of whether
>> CONFIG_DEFERRED_STRUCT_PAGE_INIT is set. Is that
>> correct?
> Yes, struct pages should be ready by the time node_dev_init() is called
> even when CONFIG_DEFERRED_STRUCT_PAGE_INIT is set.
ok.
Thanks Mike.
>
>>> kernel_init_freeable() ->
>>> page_alloc_init_late(); /* completes initialization of deferred pages */
>>> ...
>>> do_basic_setup() ->
>>> driver_init() ->
>>> node_dev_init();
>>>
>>> The next step could be refactoring register_mem_block_under_node_early() to
>>> loop over memblock regions rather than over pfns.
>> So it the current implementation
>>
>> node_dev_init()
>> register_one_node
>> register_memory_blocks_under_node
>> walk_memory_blocks()
>> register_mem_block_under_node_early
>> get_nid_for_pfn
>>
>> We get each node's start and end PFN from the pg_data. Using these
>> values, we determine the memory block's start and end within the
>> current node. To identify the node to which these memory block
>> belongs,we iterate over each PFN in the range.
>>
>> The problem I am facing is,
>>
>> In my system node4 has a memory block ranging from memory30351
>> to memory38524, and memory128433. The memory blocks between
>> memory38524 and memory128433 do not belong to this node.
>>
>> In walk_memory_blocks() we iterate over all memory blocks starting
>> from memory38524 to memory128433.
>> In register_mem_block_under_node_early(), up to memory38524, the
>> first pfn correctly returns the corresponding nid and the function
>> returns from there. But after memory38524 and until memory128433,
>> the loop iterates through each pfn and checks the nid. Since the nid
>> does not match the required nid, the loop continues. This causes
>> the soft lockups.
>>
>> This issue occurs only when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, as a binary search is used to determine the PFN's nid. When
>> this configuration is disabled, pfn_to_nid is faster, and the issue does
>> not seen.( Faster because nid is getting from page)
>>
>> To speed up the code when CONFIG_DEFERRED_STRUCT_PAGE_INIT
>> is enabled, I added this function that iterates over all memblock regions
>> for each memory block to determine its nid.
>>
>> "Loop over memblock regions instead of iterating over PFNs" -
>> My question is - in register_one_node, do you mean that we should iterate
>> over all memblock regions, identify the regions belonging to the current
>> node, and then retrieve the corresponding memory blocks to register them
>> under that node?
> I looked more closely at register_mem_block_under_node_early() and
> iteration over memblock regions won't make sense there.
>
> It might make sense to use for_each_mem_range() as top level loop in
> node_dev_init(), but that's a separate topic.
Yes, this makes sense to me as well. So in your opinion, instead of adding
a new memblock search function like I added , it's better to use
|for_each_mem_range()| in|node_dev_init()|, which would work for all
cases—regardless of whether|CONFIG_DEFERRED_STRUCT_PAGE_INIT| is set or
not. Right?
>
>> Thanks
>> Donet
>>
Powered by blists - more mailing lists