linux-kernel - Re: [PATCH 2/2] mm/vmalloc: Add attempt_larger_order

Message-ID: <31e0d55f-b4a3-4b4c-8018-82d76c429d7b@arm.com>
Date: Wed, 24 Dec 2025 12:05:49 +0530
From: Dev Jain <dev.jain@....com>
To: Ryan Roberts <ryan.roberts@....com>, Uladzislau Rezki <urezki@...il.com>
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
 Vishal Moola <vishal.moola@...il.com>, Baoquan He <bhe@...hat.com>,
 LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mm/vmalloc: Add attempt_larger_order_alloc parameter


On 18/12/25 5:23 pm, Ryan Roberts wrote:
> On 18/12/2025 04:55, Dev Jain wrote:
>> On 17/12/25 8:50 pm, Ryan Roberts wrote:
>>> On 17/12/2025 12:02, Uladzislau Rezki wrote:
>>>>> On 16/12/2025 21:19, Uladzislau Rezki (Sony) wrote:
>>>>>> Introduce a module parameter to enable or disable the large-order
>>>>>> allocation path in vmalloc. High-order allocations are disabled by
>>>>>> default so far, but users may explicitly enable them at runtime if
>>>>>> desired.
>>>>>>
>>>>>> High-order pages allocated for vmalloc are immediately split into
>>>>>> order-0 pages and later freed as order-0, which means they do not
>>>>>> feed the per-CPU page caches. As a result, high-order attempts tend
>>>>>> to bypass the PCP fastpath and fall back to the buddy allocator, which
>>>>>> can affect performance.
>>>>>>
>>>>>> However, when the PCP caches are empty, high-order allocations may
>>>>>> show better performance characteristics especially for larger
>>>>>> allocation requests.
>>>>> I wonder if a better solution would be "allocate order-0 if available in pcp,
>>>>> else try large order, else fallback to order-0" Could that provide the best of
>>>>> all worlds without needing a configuration knob?
>>>>>
>>>> I am not sure; to me it looks a bit odd.
>>> Perhaps it would feel better if it was generalized to "first try allocation from
>>> PCP list, highest to lowest order, then try allocation from the buddy, highest
>>> to lowest order"?
>>>
>>>> Ideally it would be
>>>> good to just free it as a high-order page and not as order-0 pieces.
>>> Yeah perhaps that's better. How about something like this (very lightly tested
>>> and no performance results yet):
>>>
>>> (And I should admit I'm not 100% sure it is safe to call free_frozen_pages()
>>> with a contiguous run of order-0 pages, but I'm not seeing any warnings or
>>> memory leaks when running mm selftests...)
>> Wow I wasn't aware that we can do this. I see that free_hotplug_page_range() in
>> arm64/mmu.c already does this - it computes order from size and passes it to
>> __free_pages().
> Hmm that looks dodgy to me. But I'm not sure I actually understand what is going
> on...

I think this is fine. This function frees either the altmap (in which case no struct page is
freed), or the array of struct pages in the vmemmap:

free_map_bootmem -> vmemmap_free (altmap=NULL) -> unmap_hotplug_range(free_mapped=true, altmap=NULL) -> ultimately __free_pages.

free_map_bootmem is called from section_deactivate, and takes in a virtual address corresponding to the vmemmap struct pages.
This virtual address is retrieved from sparse_decode_mem_map (note that the return value of this function is misleading).
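
For reference, the "computes order from size" step mentioned above for free_hotplug_page_range() can be modeled in user space roughly as follows. This is only a sketch under stated assumptions: size_to_order() is a hypothetical stand-in for what the kernel's get_order() computes, the SKETCH_* macros are made up for illustration, and a 4 KiB page size is assumed. It is not the arm64 code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration only: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical user-space model of deriving an allocation order from a
 * byte size: the smallest order such that (PAGE_SIZE << order) >= size.
 * Not hardened against sizes close to the maximum of size_t.
 */
unsigned int size_to_order(size_t size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

Under these assumptions a 2 MiB range maps to order 9, so a single order-9 free would cover the whole range only if it is physically contiguous and suitably aligned.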