linux-kernel - Re: [PATCH v6 0/7] Buddy allocator like (or non-uniform) folio split


Message-ID: <019EB6CA-0F4B-496A-B2AE-A3A553585281@nvidia.com>
Date: Fri, 07 Feb 2025 09:11:39 -0500
From: Zi Yan <ziy@...dia.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
 "Matthew Wilcox (Oracle)" <willy@...radead.org>
Cc: linux-mm@...ck.org,
 "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
 Ryan Roberts <ryan.roberts@....com>, Hugh Dickins <hughd@...gle.com>,
 David Hildenbrand <david@...hat.com>, Yang Shi <yang@...amperecomputing.com>,
 Miaohe Lin <linmiaohe@...wei.com>, Kefeng Wang <wangkefeng.wang@...wei.com>,
 Yu Zhao <yuzhao@...gle.com>, John Hubbard <jhubbard@...dia.com>,
 Baolin Wang <baolin.wang@...ux.alibaba.com>, linux-kselftest@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 0/7] Buddy allocator like (or non-uniform) folio split

On 6 Feb 2025, at 3:01, Andrew Morton wrote:

> On Tue,  4 Feb 2025 22:14:10 -0500 Zi Yan <ziy@...dia.com> wrote:
>
>> This patchset adds a new buddy allocator like (or non-uniform) large folio
>> split to reduce the total number of after-split folios, the amount of memory
>> needed for multi-index xarray split, and keep more large folios after a split.
>
> It would be useful (vital, really) to provide some measurements which
> help others understand the magnitude of these resource savings, please.

Hi Andrew,

Can you please drop this series for now? After your request above, I found that
I had misunderstood how xas_split_alloc() and xas_split() work in xarray. As a
result, my current implementation allocates more xa_nodes than needed during a
non-uniform split, although the excess ones are freed at the end. That defeats
the purpose of reducing the memory consumption of multi-index xarray split,
even though folio_split() has no functional issue AFAICT. I am working on a
better implementation that might require new xarray operations, and I will post
it as v7 later. I really appreciate that you asked for more info above. :)

More details on the memory saving for multi-index xarray split during
non-uniform split compared to the existing uniform split (I will add this to
the commit log in the next version):

The existing uniform split requires 2^(order % XA_CHUNK_SHIFT) xa_node
allocations during the split, when the folio needs to be split to order-0. But
non-uniform split requires at most 1 xa_node allocation. For example, to split
an order-9 folio, 8 xa_nodes are needed for uniform split, since the folio takes
8 multi-index slots in the xarray. But for non-uniform split, only the slot
containing the given struct page needs an xa_node after the split. That saves
7 xa_nodes.
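
As a rough illustration of the arithmetic above, here is a small userspace
sketch (not kernel code). It assumes XA_CHUNK_SHIFT is 6, its usual value when
CONFIG_BASE_SMALL is off, and the helper names are hypothetical. It only encodes
the counts as stated in this mail, which still need confirmation.

```c
#include <assert.h>

/* Assumption: XA_CHUNK_SHIFT is 6 (usual value without CONFIG_BASE_SMALL). */
#define XA_CHUNK_SHIFT 6

/*
 * xa_nodes allocated when uniformly splitting an order-`order` folio down to
 * order-0, using the 2^(order % XA_CHUNK_SHIFT) formula stated above.
 */
unsigned int uniform_split_nodes(unsigned int order)
{
	return 1u << (order % XA_CHUNK_SHIFT);
}

/*
 * Upper bound stated above for non-uniform split: only the slot containing
 * the given struct page needs a new xa_node.
 */
unsigned int nonuniform_split_nodes_max(unsigned int order)
{
	(void)order; /* the stated bound does not depend on the order */
	return 1;
}

/* Minimum number of xa_node allocations avoided by non-uniform split. */
unsigned int nodes_saved(unsigned int order)
{
	unsigned int uniform = uniform_split_nodes(order);
	unsigned int nonuniform = nonuniform_split_nodes_max(order);

	/* 2^k is always >= 1, so the subtraction cannot underflow. */
	assert(uniform >= nonuniform);
	return uniform - nonuniform;
}
```

With these assumptions, an order-9 folio gives 9 % 6 = 3, so the uniform split
needs 2^3 = 8 xa_nodes and the non-uniform split at most 1, matching the saving
of 7 described above.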

Hi Matthew,

Do you mind checking my statement above on the xarray memory saving? Please
correct me if I missed anything. Thanks.

Best Regards,
Yan, Zi