Message-ID: <aAs8eWHw5ELbmSZt@harry>
Date: Fri, 25 Apr 2025 16:40:41 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Pedro Falcato <pfalcato@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Christoph Lameter <cl@...two.org>,
David Rientjes <rientjes@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Dennis Zhou <dennis@...nel.org>, Tejun Heo <tj@...nel.org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>, Jiri Pirko <jiri@...nulli.us>,
Vlad Buslov <vladbu@...dia.com>,
Yevgeny Kliteynik <kliteyn@...dia.com>, Jan Kara <jack@...e.cz>,
Byungchul Park <byungchul@...com>, linux-mm@...ck.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/7] Reviving the slab destructor to tackle the
percpu allocator scalability problem
On Thu, Apr 24, 2025 at 05:20:59PM +0200, Mateusz Guzik wrote:
> On Thu, Apr 24, 2025 at 1:28 PM Pedro Falcato <pfalcato@...e.de> wrote:
> > > How to do this with slab constructors and destructors: the constructor
> > > allocates percpu memory, and the destructor frees it when the slab pages
> > > are reclaimed; this slightly alters the constructor’s semantics,
> > > as it can now fail.
> >
> > I really really really really don't like this. We're opening a pandora's box
> > of locking issues for slab deadlocks and other subtle issues. IMO the best
> > solution there would be, what, failing dtors? which says a lot about the whole
> > situation...
> >
>
> I noted the need to use leaf spin locks in my IRC conversations with
> Harry and later in this very thread, it is a bummer this bit did not
> make into the cover letter -- hopefully it would have avoided this
> exchange.
My bad. Yes, I should have included it in the cover letter;
I'll make sure to include it in future revisions of the series.
> I'm going to summarize this again here:
>
> By API contract the dtor can only take a leaf spinlock, in this case one which:
> 1. disables irqs
Right, as interrupt handlers can also introduce lock dependencies.
> 2. is the last lock in the dependency chain, as in no locks are taken
> while holding it
So if the destructor takes a lock, then no user of that lock, in any
dependency chain, may acquire another lock while holding it.
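For illustration, a dtor honoring that contract could look something
like this (a rough sketch only; the lock and list names are
hypothetical, not from the series):

```c
/* Sketch, not actual series code: the dtor takes exactly one
 * spinlock, with irqs disabled, and acquires no other lock while
 * holding it -- making it a leaf in every dependency chain.
 */
static void example_dtor(void *object)
{
	struct example_obj *o = object;
	unsigned long flags;

	spin_lock_irqsave(&example_dtor_lock, flags);	/* leaf: irqs off */
	list_add(&o->free_list, &example_deferred_free);/* no further locking */
	spin_unlock_irqrestore(&example_dtor_lock, flags);
}
```

Any heavier work (e.g. the actual percpu free) would then be deferred
to a context that is free to take other locks.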
> That way there is no possibility of a deadlock.
>
> This poses a question on how to enforce it and this bit is easy: for
> example one can add leaf-spinlock notion to lockdep. Then a misuse on
> allocation side is going to complain immediately even without
> triggering reclaim. Further, if one would feel so inclined, a test
> module can walk the list of all slab caches and do a populate/reclaim
> cycle on those with the ctor/dtor pair.
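A rough sketch of what such a test module might do (hypothetical code,
not part of this series: slab_caches/slab_mutex are mm/-internal, and
the ->dtor field only exists in the proposed patches, so this could
only live as an in-tree test):

```c
/* Hypothetical test sketch: walk all slab caches and exercise an
 * alloc/free/shrink cycle on those with a ctor/dtor pair, so a
 * misplaced lock on the dtor side trips lockdep immediately. */
static void test_ctor_dtor_caches(void)
{
	struct kmem_cache *s;

	mutex_lock(&slab_mutex);
	list_for_each_entry(s, &slab_caches, list) {
		void *obj;

		if (!s->ctor || !s->dtor)
			continue;
		obj = kmem_cache_alloc(s, GFP_KERNEL);
		if (obj)
			kmem_cache_free(s, obj);
		/* shrink to reclaim slab pages, which runs the dtors */
		kmem_cache_shrink(s);
	}
	mutex_unlock(&slab_mutex);
}
```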
>
> Then there is the matter of particular consumers being ready to do
> what they need to on the dtor side only with the spinlock. Does not
> sound like a fundamental problem.
>
> > Case in point:
> > What happens if you allocate a slab and start ->ctor()-ing objects, and then
> > one of the ctors fails? We need to free the ctor, but not without ->dtor()-ing
> > everything back (AIUI this is not handled in this series, yet).
It is handled in __free_slab().
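To illustrate the rollback idea (a self-contained userspace sketch of
the pattern, not the actual __free_slab() code; all names here are
made up for the example):

```c
#include <stddef.h>

/* Userspace sketch of ctor-failure rollback: if the i-th ctor fails,
 * run the dtor on the i objects already constructed, so no
 * half-initialized slab is ever left behind. */
struct obj { int live; };

static int ctor(struct obj *o, int should_fail)
{
	if (should_fail)
		return -1;
	o->live = 1;
	return 0;
}

static void dtor(struct obj *o)
{
	o->live = 0;
}

static int construct_slab(struct obj *objs, size_t n, size_t fail_at)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ctor(&objs[i], i == fail_at)) {
			while (i-- > 0)
				dtor(&objs[i]);	/* unwind in reverse order */
			return -1;
		}
	}
	return 0;
}
```

The important property is that construction either fully succeeds or
unwinds completely; the dtor side never needs to fail.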
> > Besides this
> > complication, if failing dtors were added into the mix, we'd be left with a
> > half-initialized slab(!!) in the middle of the cache waiting to get freed,
> > without being able to.
> >
>
> Per my previous paragraph failing dtors would be a self-induced problem.
>
> I can agree one has to roll things back if ctors don't work out, but I
> don't think this poses a significant problem.
Agreed. It'd be better to write destructors carefully to avoid
deadlocks rather than allowing them to fail.
--
Cheers,
Harry / Hyeonggon