[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <36577539-bff6-476e-8d6b-ca20e3de2391@suse.cz>
Date: Tue, 26 Nov 2024 16:27:23 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Ryan Roberts <ryan.roberts@....com>,
Dave Kleikamp <dave.kleikamp@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>, Christoph Lameter <cl@...ux.com>,
David Rientjes <rientjes@...gle.com>, Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [RFC PATCH] mm/slab: Avoid build bug for calls to kmalloc with a
large constant
On 11/26/24 16:09, Vlastimil Babka wrote:
> On 11/26/24 15:53, Ryan Roberts wrote:
>> On 26/11/2024 12:36, Vlastimil Babka wrote:
>>> On 11/26/24 13:18, Ryan Roberts wrote:
>>>> On 14/11/2024 10:09, Vlastimil Babka wrote:
>>>>> On 11/1/24 21:16, Dave Kleikamp wrote:
>>>>>> When boot-time page size is enabled, the test against KMALLOC_MAX_CACHE_SIZE
>>>>>> is no longer optimized out with a constant size, so a build bug may
>>>>>> occur on a path that won't be reached.
>>>>>
>>>>> That's rather unfortunate, the __builtin_constant_p(size) part of
>>>>> kmalloc_noprof() really expects things to resolve at compile time and it
>>>>> would be better to keep it that way.
>>>>>
>>>>> I think it would be better if we based KMALLOC_MAX_CACHE_SIZE itself on
>>>>> PAGE_SHIFT_MAX and kept it constant, instead of introducing
>>>>> KMALLOC_SHIFT_HIGH_MAX only for some sanity checks.
>>>>>
>>>>> So if the kernel was built to support 4k to 64k, but booted as 4k, it would
>>>>> still create and use kmalloc caches up to 128k. SLUB should handle that fine
>>>>> (if not, please report it :)
>>>>
>>>> So when PAGE_SIZE_MAX=64K and PAGE_SIZE=4K, kmalloc will support up to 128K
>>>> whereas before it only supported up to 8K. I was trying to avoid that since I
>>>> assumed that would be costly in terms of extra memory allocated for those higher
>>>> order buckets that will never be used. But I have no idea how SLUB works in
>>>> practice. Perhaps memory for the cache is only lazily allocated so we won't see
>>>> an issue in practice?
>>>
>>> Yes the e.g. 128k slabs themselves will be lazily allocated. There will be
>>> some overhead with the management structures (struct kmem_cache etc) but
>>> much smaller.
>>> To be completely honest, some extra overhead might come to be when the slabs
>>> are allocated ans later the user frees those allocations. kmalloc_large()
>>> wwould return them immediately, while a regular kmem_cache will keep one or
>>> more per cpu for reuse. But if that becomes a visible problem we can tune
>>> those caches to discard slabs more aggressively.
>>
>> Sorry to keep pushing on this, now that I've actually looked at the code, I feel
>> I have a slightly better understanding:
>>
>> void *kmalloc_noprof(size_t size, gfp_t flags)
>> {
>> if (__builtin_constant_p(size) && size) {
>>
>> if (size > KMALLOC_MAX_CACHE_SIZE)
>> return __kmalloc_large_noprof(size, flags); <<< (1)
>>
>> index = kmalloc_index(size);
>> return __kmalloc_cache_noprof(...); <<< (2)
>> }
>> return __kmalloc_noprof(size, flags); <<< (3)
>> }
>>
>> So if size and KMALLOC_MAX_CACHE_SIZE are constant, we end up with this
>> resolving either to a call to (1) or (2), decided at compile time. If
>> KMALLOC_MAX_CACHE_SIZE is not constant, (1), (2) and the runtime conditional
>> need to be kept in the function.
>>
>> But intuatively, I would have guessed that given the choice between the overhead
>> of keeping that runtime conditional vs keeping per-cpu slab caches for extra
>> sizes between 16K and 128K, then the runtime conditional would be preferable. I
>> would guess that quite a bit of memory could get tied up in those caches?
>>
>> Why is your preference the opposite? What am I not understanding?
>
> +CC more slab people.
>
> So the above is an inline function, but constructed in a way that it should,
> without further inline code, become
> - a call to __kmalloc_large_noprof() for build-time constant size larger
> than KMALLOC_MAX_CACHE_SIZE
> - a call to __kmalloc_cache_noprof() for build-time constant size smaller
> than KMALLOC_MAX_CACHE_SIZE, where the cache is picked from an array with
> compile-time calculated index
> - call to __kmalloc_noprof() for non-constant sizes otherwise
>
> If KMALLOC_MAX_CACHE_SIZE stops being build-time constant, the sensible way
> to handle it would be to #ifdef or otherwise compile out away the whole "if
> __builtin_constant_p(size)" part and just call __kmalloc_noprof() always, so
> we don't blow the inline paths with a KMALLOC_MAX_CACHE_SIZE check leading
> to choice between calling __kmalloc_large_noprof() or __kmalloc_cache_noprof().
Or maybe we could have PAGE_SIZE_MAX derived KMALLOC_MAX_CACHE_SIZE_MAX
behave as the code above currently does with KMALLOC_MAX_CACHE_SIZE, and
additionally have PAGE_SIZE_MIN derived KMALLOC_MAX_CACHE_SIZE_MIN, where
build-time-constant size larger than KMALLOC_MAX_CACHE_SIZE_MIN (which is a
compile-time test) is redirected to __kmalloc_noprof() for a run-time test.
That seems like the optimum solution :)
> I just don't believe we would waste so much memory with caches the extra
> sizes for sizes between 16K and 128K, so would do that suggestion only if
> proven wrong. But I wouldn't mind it that much if you chose it right away.
> The solution earlier in this thread to patch __kmalloc_index() would be
> worse than either of those two alternatives though.
Powered by blists - more mailing lists