Message-ID: <80208a6c-ec42-6260-5f6f-b3c5c2788fcd@gentwo.org>
Date: Thu, 24 Apr 2025 08:50:04 -0700 (PDT)
From: "Christoph Lameter (Ampere)" <cl@...two.org>
To: Harry Yoo <harry.yoo@...cle.com>
cc: Vlastimil Babka <vbabka@...e.cz>, David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>, Dennis Zhou <dennis@...nel.org>,
Tejun Heo <tj@...nel.org>, Mateusz Guzik <mjguzik@...il.com>,
Jamal Hadi Salim <jhs@...atatu.com>, Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>, Vlad Buslov <vladbu@...dia.com>,
Yevgeny Kliteynik <kliteyn@...dia.com>, Jan Kara <jack@...e.cz>,
Byungchul Park <byungchul@...com>, linux-mm@...ck.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/7] Reviving the slab destructor to tackle the percpu
allocator scalability problem
On Thu, 24 Apr 2025, Harry Yoo wrote:
> Consider mm_struct: it allocates two percpu regions (mm_cid and rss_stat),
> so each allocate/free cycle requires two expensive acquire/release
> operations on that mutex.
> We can mitigate this contention by retaining the percpu regions after
> the object is freed and releasing them only when the backing slab pages
> are freed.
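
For reference, a rough sketch of what the quoted scheme amounts to
(illustrative only; the struct/ctor/dtor names and the exact interface of
the revived destructor hook are placeholders here, not the RFC's code):

struct my_obj {
        int __percpu *pcpu;     /* stays allocated while the slab object exists */
};

/* runs once when a slab page first provides this object */
static void my_obj_ctor(void *p)
{
        struct my_obj *obj = p;

        /* one pcpu_alloc_mutex round trip (error handling omitted) */
        obj->pcpu = alloc_percpu(int);
}

/* hypothetical revived dtor: runs only when the backing slab page is freed */
static void my_obj_dtor(void *p)
{
        struct my_obj *obj = p;

        free_percpu(obj->pcpu);
}

With that in place the common kmem_cache_alloc()/kmem_cache_free() path no
longer touches the percpu allocator; only slab page allocation and teardown
do.
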
Could you keep a cache of recently used percpu regions so that you can
avoid frequent percpu allocation operations?
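
Something like this, purely as a sketch (the names, the locking and the
single-size assumption are mine; a real version would probably need
per-size or per-cache pools):

#define PCPU_CACHE_SIZE 16

static void __percpu *pcpu_cache[PCPU_CACHE_SIZE];
static int pcpu_cache_nr;
static DEFINE_SPINLOCK(pcpu_cache_lock);

static void __percpu *cached_alloc_percpu(size_t size, size_t align)
{
        void __percpu *p = NULL;

        spin_lock(&pcpu_cache_lock);
        if (pcpu_cache_nr)
                p = pcpu_cache[--pcpu_cache_nr];
        spin_unlock(&pcpu_cache_lock);

        /* fall back to the real allocator only when the cache is empty */
        return p ? p : __alloc_percpu(size, align);
}

static void cached_free_percpu(void __percpu *p)
{
        spin_lock(&pcpu_cache_lock);
        if (pcpu_cache_nr < PCPU_CACHE_SIZE) {
                pcpu_cache[pcpu_cache_nr++] = p;
                p = NULL;
        }
        spin_unlock(&pcpu_cache_lock);

        if (p)
                free_percpu(p);
}
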
You could allocate larger percpu areas for a batch of objects and
then assign them as needed.
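
Again only a sketch of the batching idea, assuming all users want the same
slot size (names are made up):

#define BATCH 32

struct pcpu_batch {
        long __percpu *base;    /* BATCH slots from a single percpu allocation */
        unsigned long used;     /* bitmap of handed-out slots */
};

static long __percpu *batch_get(struct pcpu_batch *b)
{
        unsigned long slot;

        if (!b->base) {
                b->base = __alloc_percpu(BATCH * sizeof(long), sizeof(long));
                if (!b->base)
                        return NULL;
                b->used = 0;
        }

        slot = find_first_zero_bit(&b->used, BATCH);
        if (slot >= BATCH)
                return NULL;    /* batch exhausted, caller falls back */

        __set_bit(slot, &b->used);
        return b->base + slot;  /* per-slot percpu pointer via offset arithmetic */
}

Objects would take a slot on allocation and clear their bit on free, so a
full trip to the percpu allocator is only needed once per BATCH objects.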