Message-ID: <2541b182-1f1c-c1ed-c15c-9d71160eb6fe@redhat.com>
Date: Wed, 17 Mar 2021 12:20:15 +0100
From: David Hildenbrand <david@...hat.com>
To: Anshuman Khandual <anshuman.khandual@....com>, linux-mm@...ck.org
Cc: Russell King <linux@...linux.org.uk>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Rapoport <rppt@...nel.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] mm: Enable generic pfn_valid() to handle early sections
with memmap holes
On 11.03.21 05:29, Anshuman Khandual wrote:
>
>
> On 3/8/21 2:07 PM, David Hildenbrand wrote:
>> On 08.03.21 04:27, Anshuman Khandual wrote:
>>> Platforms like arm and arm64 have redefined pfn_valid() because their early
>>> memory sections might have contained memmap holes caused by memblock areas
>>> tagged with MEMBLOCK_NOMAP, which should be skipped while validating a pfn
>>> for struct page backing. This scenario could be captured with a new option
>>> CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES and then generic pfn_valid() can be
>>> improved to accommodate such platforms. This reduces overall code footprint
>>> and also improves maintainability.
>>>
>>> Commit 4f5b0c178996 ("arm, arm64: move free_unused_memmap() to generic mm")
>>> had used CONFIG_HAVE_ARCH_PFN_VALID to gate free_unused_memmap(), which in
>>> turn had expanded its scope to new platforms like arc and m68k. Instead,
>>> let's restrict the scope of free_unused_memmap() back to the arm and arm64
>>> platforms using this new config option, i.e. CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES.
>>>
>>> While here, it exports the symbol memblock_is_map_memory() to allow building
>>> drivers that depend on pfn_valid() but lack the required visibility. After
>>> this new config is in place, just drop CONFIG_HAVE_ARCH_PFN_VALID from both
>>> arm and arm64 platforms.
>>>
>>> Cc: Russell King <linux@...linux.org.uk>
>>> Cc: Catalin Marinas <catalin.marinas@....com>
>>> Cc: Will Deacon <will@...nel.org>
>>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>>> Cc: Mike Rapoport <rppt@...nel.org>
>>> Cc: David Hildenbrand <david@...hat.com>
>>> Cc: linux-arm-kernel@...ts.infradead.org
>>> Cc: linux-kernel@...r.kernel.org
>>> Cc: linux-mm@...ck.org
>>> Suggested-by: David Hildenbrand <david@...hat.com>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@....com>
>>> ---
>>> This applies on 5.12-rc2 along with arm64 pfn_valid() fix patches [1] and
>>> has been lightly tested on the arm64 platform. The idea to represent this
>>> unique situation on the arm and arm64 platforms with a config option was
>>> proposed by David H during an earlier discussion [2]. This still does not
>>> build on the arm platform due to pfn_valid() resolution errors. Nonetheless,
>>> I wanted to get some early feedback on whether the overall approach here is
>>> acceptable.
>>
>> It might make sense to keep the arm variant for now. The arm64 variant is where the magic happens and where we missed updates when working on the generic variant.
>
> Sure, will drop the changes on arm.
>
>>
>> The generic variant really only applies to 64bit targets where we have SPARSEMEM. See x86 as an example.
>
> Okay.
>
>>
>> [...]
>>
>>> /*
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 47946cec7584..93532994113f 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -1409,8 +1409,23 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
>>> }
>>> #endif
>>> +bool memblock_is_map_memory(phys_addr_t addr);
>>> +
>>> #ifndef CONFIG_HAVE_ARCH_PFN_VALID
>>> static inline int pfn_valid(unsigned long pfn)
>>> +{
>>> + phys_addr_t addr = PFN_PHYS(pfn);
>>> +
>>> + /*
>>> + * Ensure the upper PAGE_SHIFT bits are clear in the
>>> + * pfn. Else it might lead to false positives when
>>> + * some of the upper bits are set, but the lower bits
>>> + * match a valid pfn.
>>> + */
>>> + if (PHYS_PFN(addr) != pfn)
>>> + return 0;
>>
>> I think this should be fine for other archs as well.
>>
>>> +
>>> +#ifdef CONFIG_SPARSEMEM
>>
>> Why do we need the ifdef now? If that's to cover the arm case, then please consider the arm64 case only for now.
>
> Yes, it is not needed.
>
>>
>>> {
>>> struct mem_section *ms;
>>> @@ -1423,7 +1438,14 @@ static inline int pfn_valid(unsigned long pfn)
>>> * Traditionally early sections always returned pfn_valid() for
>>> * the entire section-sized span.
>>> */
>>> - return early_section(ms) || pfn_section_valid(ms, pfn);
>>> + if (early_section(ms))
>>> + return IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ?
>>> + memblock_is_map_memory(pfn << PAGE_SHIFT) : 1;
>>> +
>>> + return pfn_section_valid(ms, pfn);
>>> +}
>>> +#endif
>>> + return 1;
>>> }
>>> #endif
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index 24c045b24b95..0ec20f661b3f 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -135,6 +135,16 @@ config HAVE_FAST_GUP
>>> config ARCH_KEEP_MEMBLOCK
>>> bool
>>> +config HAVE_EARLY_SECTION_MEMMAP_HOLES
>>> + depends on ARCH_KEEP_MEMBLOCK && SPARSEMEM_VMEMMAP
>>> + def_bool n
>>> + help
>>> + Early sections on certain platforms might have portions which are
>>> + not backed with struct page mapping as their memblock entries are
>>> + marked with MEMBLOCK_NOMAP. When subscribed, this option enables
>>> + specific handling for those memory sections in certain situations
>>> + such as pfn_valid().
>>> +
>>> # Keep arch NUMA mapping infrastructure post-init.
>>> config NUMA_KEEP_MEMINFO
>>> bool
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index afaefa8fc6ab..d9fa2e62ab7a 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -1744,6 +1744,7 @@ bool __init_memblock memblock_is_map_memory(phys_addr_t addr)
>>> return false;
>>> return !memblock_is_nomap(&memblock.memory.regions[i]);
>>> }
>>> +EXPORT_SYMBOL(memblock_is_map_memory);
>>> int __init_memblock memblock_search_pfn_nid(unsigned long pfn,
>>> unsigned long *start_pfn, unsigned long *end_pfn)
>>> @@ -1926,7 +1927,7 @@ static void __init free_unused_memmap(void)
>>> unsigned long start, end, prev_end = 0;
>>> int i;
>>> - if (!IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) ||
>>> + if (!IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES) ||
>>> IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP))
>>> return;
>>>
>>
>> With
>>
>> commit 1f90a3477df3ff1a91e064af554cdc887c8f9e5e
>> Author: Dan Williams <dan.j.williams@...el.com>
>> Date: Thu Feb 25 17:17:05 2021 -0800
>>
>> mm: teach pfn_to_online_page() about ZONE_DEVICE section collisions
>>
>> (still in -next I think)
>
> It has already been merged into mainline.
>
>>
>> You'll also have to take care of pfn_to_online_page().
>>
>
> Would something like this work?
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 5ba51a8bdaeb..19812deb807f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -309,6 +309,11 @@ struct page *pfn_to_online_page(unsigned long pfn)
> * Save some code text when online_section() +
> * pfn_section_valid() are sufficient.
> */
> + if (IS_ENABLED(CONFIG_HAVE_EARLY_SECTION_MEMMAP_HOLES)) {
> + if (early_section(ms) && !memblock_is_map_memory(PFN_PHYS(pfn)))
> + return NULL;
> + }
> +
> if (IS_ENABLED(CONFIG_HAVE_ARCH_PFN_VALID) && !pfn_valid(pfn))
> return NULL;
>
Sorry for the late reply, just stumbled over this again.
I think, yes. I do wonder if we then still need the
CONFIG_HAVE_ARCH_PFN_VALID handling below - are there any custom
pfn_valid() implementations with SPARSEMEM remaining? I don't think so.
--
Thanks,
David / dhildenb