Message-ID: <e4f806d4-2527-07c2-56bc-9c41789d669c@linux.intel.com>
Date: Tue, 16 Oct 2018 13:49:14 -0700
From: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
To: Pavel Tatashin <pasha.tatashin@...il.com>, linux-mm@...ck.org,
akpm@...ux-foundation.org
Cc: pavel.tatashin@...rosoft.com, mhocko@...e.com,
dave.jiang@...el.com, linux-kernel@...r.kernel.org,
willy@...radead.org, davem@...emloft.net,
yi.z.zhang@...ux.intel.com, khalid.aziz@...cle.com,
rppt@...ux.vnet.ibm.com, vbabka@...e.cz,
sparclinux@...r.kernel.org, dan.j.williams@...el.com,
ldufour@...ux.vnet.ibm.com, mgorman@...hsingularity.net,
mingo@...nel.org, kirill.shutemov@...ux.intel.com
Subject: Re: [mm PATCH v3 2/6] mm: Drop meminit_pfn_in_nid as it is redundant
On 10/16/2018 1:33 PM, Pavel Tatashin wrote:
>
>
> On 10/15/18 4:27 PM, Alexander Duyck wrote:
>> As best as I can tell the meminit_pfn_in_nid call is completely redundant.
>> The deferred memory initialization is already making use of
>> for_each_free_mem_range which in turn will call into __next_mem_range which
>> will only return a memory range if it matches the node ID provided assuming
>> it is not NUMA_NO_NODE.
>>
>> I am operating on the assumption that there are no zones or pg_data_t
>> structures that have a NUMA node of NUMA_NO_NODE associated with them. If
>> that is the case then __next_mem_range will never return a memory range
>> that doesn't match the zone's node ID and as such the check is redundant.
>>
>> So one piece I would like to verify on this is whether this works for ia64.
>> Technically it was using a different approach to get the node ID, but it
>> seems to have the node ID also encoded into the memblock. So I am
>> assuming this is okay, but would like to get confirmation on that.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
>
> If I am not mistaken, this code is for systems with memory interleaving.
> A quick look shows that x86, powerpc, s390, and sparc have it set.
>
> I am not sure about other arches, but at least on SPARC, there are
> some processors with a memory interleaving feature:
>
> http://www.fujitsu.com/global/products/computing/servers/unix/sparc-enterprise/technology/performance/memory.html
>
> Pavel
I get what it is for. However, as best I can tell, the check is
actually redundant. In the case of the deferred page initialization we
are already pulling the memory regions via "for_each_free_mem_range".
That function is already passed a NUMA node ID, so we are already
checking each memory range to determine whether it is in the node. As
such it doesn't really make sense to then walk every PFN and go back to
the memory range it came from to check whether the node matches.
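To make that concrete, the range walk in deferred_init_memmap() looks
roughly like this (simplified sketch from mm/page_alloc.c of that era,
not verbatim, with the accounting details trimmed):

	/* memblock only hands back free ranges that are on "nid", so
	 * every PFN passed down to deferred_init_pages() is already
	 * known to be on the right node.
	 */
	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
		spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
		epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
		nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
	}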
You can take a look at __next_mem_range, which is called by
for_each_free_mem_range and is passed &memblock.memory as the set of
ranges to walk and &memblock.reserved as the set of ranges to avoid:
https://elixir.bootlin.com/linux/latest/source/mm/memblock.c#L899
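For reference, for_each_free_mem_range is just a thin wrapper; roughly
(from include/linux/memblock.h of that era):

#define for_each_free_mem_range(i, nid, flags, p_start, p_end, p_nid)	\
	for_each_mem_range(i, &memblock.memory, &memblock.reserved,	\
			   nid, flags, p_start, p_end, p_nid)

and the node filter inside __next_mem_range amounts to:

		/* skip ranges that are not on the requested node */
		if (nid != NUMA_NO_NODE && nid != m_nid)
			continue;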
Then you can work your way through:
meminit_pfn_in_nid(pfn, node, state)
  __early_pfn_to_nid(pfn, state)
    memblock_search_pfn_nid(pfn, &start_pfn, &end_pfn)
      memblock_search(&memblock.memory, pfn)
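Under CONFIG_NODES_SPAN_OTHER_NODES that chain boils down to roughly
the following (paraphrased, not verbatim):

static inline bool __meminit
meminit_pfn_in_nid(unsigned long pfn, int node,
		   struct mminit_pfnnid_cache *state)
{
	/* resolve the PFN back to a node via memblock_search() and
	 * compare it against the node the caller already knows
	 */
	int nid = __early_pfn_to_nid(pfn, state);

	return (nid < 0 || nid == node);
}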
From what I can tell the deferred init is going back through the
memblock.memory list we pulled this range from and just validating it
against itself. This makes sense for the standard init as that is just
going from start_pfn->end_pfn, but for the deferred init we are pulling
the memory ranges ahead of time so we shouldn't need to re-validate the
memory that is contained within that range.
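So if I am right, the per-PFN helper can simply lose the node check.
The current deferred_pfn_valid() is roughly this (sketch, not
verbatim):

static inline bool __init
deferred_pfn_valid(int nid, unsigned long pfn,
		   struct mminit_pfnnid_cache *state)
{
	if (!pfn_valid_within(pfn))
		return false;
	/* only revalidate pfn_valid() on pageblock boundaries */
	if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
		return false;
	/* this is the check I am arguing is redundant */
	if (!meminit_pfn_in_nid(pfn, nid, state))
		return false;
	return true;
}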