Message-Id: <1611529781.hxjbuadzrl.astroid@bobo.none>
Date: Mon, 25 Jan 2021 09:17:18 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Ding Tianhong <dingtianhong@...wei.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
Zefan Li <lizefan@...wei.com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH v10 11/12] mm/vmalloc: Hugepage vmalloc mappings
Excerpts from Christoph Hellwig's message of January 25, 2021 1:07 am:
> On Sun, Jan 24, 2021 at 06:22:29PM +1000, Nicholas Piggin wrote:
>> diff --git a/arch/Kconfig b/arch/Kconfig
>> index 24862d15f3a3..f87feb616184 100644
>> --- a/arch/Kconfig
>> +++ b/arch/Kconfig
>> @@ -724,6 +724,16 @@ config HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> config HAVE_ARCH_HUGE_VMAP
>> bool
>>
>> +config HAVE_ARCH_HUGE_VMALLOC
>> + depends on HAVE_ARCH_HUGE_VMAP
>> + bool
>> + help
>> + Archs that select this would be capable of PMD-sized vmaps (i.e.,
>> + arch_vmap_pmd_supported() returns true), and they must make no
>> + assumptions that vmalloc memory is mapped with PAGE_SIZE ptes. The
>> + VM_NOHUGE flag can be used to prohibit arch-specific allocations from
>> + using hugepages to help with this (e.g., modules may require it).
>
> help texts don't make sense for options that aren't user visible.
Yeah, it was really meant as a comment. Even if the option were user
visible, this kind of text wouldn't make sense as help, so I'll just turn
it into a real comment, as per Randy's suggestion.
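A minimal sketch of what that conversion might look like (the wording follows the help text quoted above; the final form in the tree may differ):

```kconfig
config HAVE_ARCH_HUGE_VMALLOC
	depends on HAVE_ARCH_HUGE_VMAP
	bool
	# Archs that select this are capable of PMD-sized vmaps (i.e.,
	# arch_vmap_pmd_supported() returns true) and must make no
	# assumption that vmalloc memory is mapped with PAGE_SIZE ptes.
	# The VM_NOHUGE flag can be used to prohibit arch-specific
	# allocations from using hugepages (e.g., modules may require it).
```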
> More importantly, is there any good reason to keep the option and not
> just go the extra step and enable huge page vmalloc for arm64 and x86
> as well?
Yes: they first need to ensure they exclude, one way or another (a VM_
flag or the prot argument), any vmallocs that can't be huge.
After they're converted, we can fold this option into HUGE_VMAP.
>> +static inline bool is_vm_area_hugepages(const void *addr)
>> +{
>> + /*
>> + * This may not 100% tell if the area is mapped with > PAGE_SIZE
>> + * page table entries, if for some reason the architecture indicates
>> + * larger sizes are available but decides not to use them, nothing
>> + * prevents that. This only indicates the size of the physical page
>> + * allocated in the vmalloc layer.
>> + */
>> + return (find_vm_area(addr)->page_order > 0);
>
> No need for the braces here.
>
>> }
>>
>> +static int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
>> + pgprot_t prot, struct page **pages, unsigned int page_shift)
>> +{
>> + unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
>> +
>> + WARN_ON(page_shift < PAGE_SHIFT);
>> +
>> + if (page_shift == PAGE_SHIFT)
>> + return vmap_small_pages_range_noflush(addr, end, prot, pages);
>
> This begs for a IS_ENABLED check to disable the hugepage code for
> architectures that don't need it.
Yeah good point.
>> +int map_kernel_range_noflush(unsigned long addr, unsigned long size,
>> + pgprot_t prot, struct page **pages)
>> +{
>> + return vmap_pages_range_noflush(addr, addr + size, prot, pages, PAGE_SHIFT);
>> +}
>
> Please just kill off map_kernel_range_noflush and map_kernel_range
> off entirely in favor of the vmap versions.
I can do a cleanup patch on top of it.
>> + for (i = 0; i < area->nr_pages; i += 1U << area->page_order) {
>
> Maybe using a helper that takes the vm_area_struct and either returns
> area->page_order or always 0 based on IS_ENABLED?
I'll see how it looks.
Thanks,
Nick