lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 4 Jan 2021 11:48:18 +0530
From:   Anshuman Khandual <anshuman.khandual@....com>
To:     David Hildenbrand <david@...hat.com>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc:     catalin.marinas@....com, will@...nel.org, ardb@...nel.org,
        Mark Rutland <mark.rutland@....com>,
        James Morse <james.morse@....com>,
        Robin Murphy <robin.murphy@....com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Michael Ellerman <michael@...erman.id.au>
Subject: Re: [RFC 1/2] arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory


On 12/22/20 2:41 PM, David Hildenbrand wrote:
> On 22.12.20 08:12, Anshuman Khandual wrote:
>> pfn_valid() validates a pfn but basically it checks for a valid struct page
>> backing for that pfn. It should always return positive for memory ranges
>> backed with struct page mapping. But currently pfn_valid() fails for all
>> ZONE_DEVICE based memory types even though they have struct page mapping.
>>
>> pfn_valid() asserts that there is a memblock entry for a given pfn without
>> MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is
>> that they do not have memblock entries. Hence memblock_is_map_memory() will
>> invariably fail via memblock_search() for a ZONE_DEVICE based address. This
>> eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs
>> to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged
>> into the system via memremap_pages() called from a driver, their respective
>> memory sections will not have SECTION_IS_EARLY set.
>>
>> Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock
>> regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set
>> for firmware reserved memory regions. memblock_is_map_memory() can just be
>> skipped as its always going to be positive and that will be an optimization
>> for the normal hotplug memory. Like ZONE_DEVIE based memory, all hotplugged
>> normal memory too will not have SECTION_IS_EARLY set for their sections.
>>
>> Skipping memblock_is_map_memory() for all non early memory sections would
>> fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its
>> performance for normal hotplug memory as well.
>>
>> Cc: Catalin Marinas <catalin.marinas@....com>
>> Cc: Will Deacon <will@...nel.org>
>> Cc: Ard Biesheuvel <ardb@...nel.org>
>> Cc: Robin Murphy <robin.murphy@....com>
>> Cc: linux-arm-kernel@...ts.infradead.org
>> Cc: linux-kernel@...r.kernel.org
>> Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support")
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@....com>
>> ---
>>  arch/arm64/mm/init.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 75addb36354a..ee23bda00c28 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -225,6 +225,18 @@ int pfn_valid(unsigned long pfn)
>>  
>>  	if (!valid_section(__pfn_to_section(pfn)))
>>  		return 0;
>> +
>> +	/*
>> +	 * ZONE_DEVICE memory does not have the memblock entries.
>> +	 * memblock_is_map_memory() check for ZONE_DEVICE based
>> +	 * addresses will always fail. Even the normal hotplugged
>> +	 * memory will never have MEMBLOCK_NOMAP flag set in their
>> +	 * memblock entries. Skip memblock search for all non early
>> +	 * memory sections covering all of hotplug memory including
>> +	 * both normal and ZONE_DEVIE based.
>> +	 */
>> +	if (!early_section(__pfn_to_section(pfn)))
>> +		return 1;
> 
> Actually, I think we want to check for partial present sections.
> 
> Maybe we can rather switch to generic pfn_valid() and tweak it to
> something like
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index fb3bf696c05e..7b1fcce5bd5a 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1382,9 +1382,13 @@ static inline int pfn_valid(unsigned long pfn)
>                 return 0;
>         /*
>          * Traditionally early sections always returned pfn_valid() for
> -        * the entire section-sized span.
> +        * the entire section-sized span. Some archs might have holes in
> +        * early sections, so double check with memblock if configured.
>          */
> -       return early_section(ms) || pfn_section_valid(ms, pfn);
> +       if (early_section(ms))
> +               return IS_ENABLED(CONFIG_EARLY_SECTION_MEMMAP_HOLES) ?
> +                      memblock_is_map_memory(pfn << PAGE_SHIFT) : 1;
> +       return pfn_section_valid(ms, pfn);
>  }
>  #endif

Could not find CONFIG_EARLY_SECTION_MEMMAP_HOLES. Are you suggesting to
create this config which could track platform scenarios where all early
sections might not have mmap coverage such as arm64 ?

> 
> 
> 
> Which users are remaining that require us to add/remove memblocks when
> hot(un)plugging memory
> 
>  $ git grep KEEP_MEM | grep memory_hotplug
> mm/memory_hotplug.c:    if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
> mm/memory_hotplug.c:    if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
> mm/memory_hotplug.c:    if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK)) {

Did not follow, do we want to drop ARCH_KEEP_MEMBLOCK ? Without it arm64
will not be able to track MEMBLOCK_NOMAP memory at runtime.

> 
> 
> I think one user we would have to handle is
> arch/arm64/mm/mmap.c:valid_phys_addr_range(). AFAIS, powerpc at least
> does not rely on memblock_is_map_memory.

memblock_is_map_memory() is currently used only on arm/arm64 platforms.
Apart from the above example in valid_phys_addr_range(), there are some
other memblock_is_map_memory() call sites as well. But then, we are not
trying to completely drop memblock_is_map_memory() or are we ?

Powered by blists - more mailing lists