Message-Id: <1598025275.jd6s9py77x.astroid@bobo.none>
Date: Sat, 22 Aug 2020 02:05:35 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Eric Dumazet <eric.dumazet@...il.com>, linux-mm@...ck.org
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
Christoph Hellwig <hch@...radead.org>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, Zefan Li <lizefan@...wei.com>
Subject: Re: [PATCH v6 11/12] mm/vmalloc: Hugepage vmalloc mappings
Excerpts from Eric Dumazet's message of August 22, 2020 1:38 am:
>
> On 8/21/20 8:12 AM, Nicholas Piggin wrote:
>> Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
>> enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
>> supports PMD sized vmap mappings.
>>
>> vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or
>> larger, and fall back to small pages if that was unsuccessful.
>>
>> Allocations that do not use PAGE_KERNEL prot are not permitted to use huge
>> pages, because not all callers expect this (e.g., module allocations vs
>> strict module rwx).
>>
>> This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node
>> POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
>>
>> This can result in more internal fragmentation and memory overhead for a
>> given allocation, an option nohugevmalloc is added to disable at boot.
>>
>>
>
> Thanks for working on this stuff, I tried something similar in the past,
> but could not really do more than a hack.
> ( https://lkml.org/lkml/2016/12/21/285 )

Oh nice. It might still be possible to use some ideas from your patch:
higher-order pages smaller than PMD size, or the memory policy stuff,
perhaps.

> Note that __init alloc_large_system_hash() is used at boot time,
> when NUMA policy is spreading allocations over all NUMA nodes.
>
> This means that on a dual node system, a hash table should be 50/50 spread.
>
> With your patch, if a hashtable is exactly the size of one huge page,
> the location of this hashtable will be not balanced, this might have some
> unwanted impact.

In that case it shouldn't, because the code divides the allocation size
by the number of nodes. In general, though, balancing will of course
have a somewhat coarser granularity than with small pages.
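
To make the granularity point concrete, here is a rough user-space sketch,
not the code from the patch. It assumes a 4K base page, a 2MB PMD size and
plain round-robin interleaving starting at node 0; the helper names
use_huge_pages() and bytes_on_node() are made up for illustration.

```c
#include <stdbool.h>

/* Assumed sizes for illustration only; real values are arch-specific. */
#define SMALL_PAGE_SIZE (4UL * 1024)
#define HUGE_PAGE_SIZE  (2UL * 1024 * 1024)

/*
 * Hypothetical helper: pick huge pages only when each node's share of
 * the allocation is at least one huge page.
 */
static bool use_huge_pages(unsigned long size, unsigned int nr_nodes)
{
	return size / nr_nodes >= HUGE_PAGE_SIZE;
}

/*
 * Hypothetical helper: bytes that land on 'node' when an allocation of
 * 'size' bytes is backed by pages of 'page_size' bytes handed out
 * round-robin across 'nr_nodes' nodes, starting at node 0.
 */
static unsigned long bytes_on_node(unsigned long size,
				   unsigned long page_size,
				   unsigned int nr_nodes,
				   unsigned int node)
{
	unsigned long nr_pages = (size + page_size - 1) / page_size;
	unsigned long pages = nr_pages / nr_nodes;

	/* The first (nr_pages % nr_nodes) nodes get one extra page. */
	if (node < nr_pages % nr_nodes)
		pages++;

	return pages * page_size;
}
```

Under those assumptions, a 2MB hash on two nodes stays on small pages and
is split evenly, while a 6MB hash uses huge pages and ends up 4MB/2MB
instead of 3MB/3MB.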

There's probably a better way to size these important hashes on NUMA. I
suspect that most of the time, on a NUMA machine, you would actually
prefer to use large pages now, even if it means taking up to 2MB more
memory per node per hash. It's not a great amount, and the allocation
size is rather arbitrary anyway.

Thanks,
Nick