Message-ID: <Zmr-KPG9F6w-uzys@zx2c4.com>
Date: Thu, 13 Jun 2024 16:11:52 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Jakub Kicinski <kuba@...nel.org>, Julia Lawall <Julia.Lawall@...ia.fr>,
linux-block@...r.kernel.org, kernel-janitors@...r.kernel.org,
bridge@...ts.linux.dev, linux-trace-kernel@...r.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
kvm@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Nicholas Piggin <npiggin@...il.com>, netdev@...r.kernel.org,
wireguard@...ts.zx2c4.com, linux-kernel@...r.kernel.org,
ecryptfs@...r.kernel.org, Neil Brown <neilb@...e.de>,
Olga Kornievskaia <kolga@...app.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, linux-nfs@...r.kernel.org,
linux-can@...r.kernel.org, Lai Jiangshan <jiangshanlai@...il.com>,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH 00/14] replace call_rcu by kfree_rcu for simple
kmem_cache_free callback

On Thu, Jun 13, 2024 at 05:46:11AM -0700, Paul E. McKenney wrote:
> How about a kmem_cache_destroy_rcu() that marks that specified cache
> for destruction, and then a kmem_cache_destroy_barrier() that waits?
>
> I took the liberty of adding your name to the Google document [1] and
> adding this section:

Cool, though no need to make me yellow!
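
(For reference, the split you're describing would presumably look
something like this from the caller's side -- just a sketch, since
neither function exists today and the exact signatures are my guess:)

	/* Mark the cache for destruction; may return while kfree_rcu()'d
	 * objects belonging to it are still outstanding. */
	void kmem_cache_destroy_rcu(struct kmem_cache *s);

	/* Wait until every cache previously marked for destruction has
	 * actually been destroyed. */
	void kmem_cache_destroy_barrier(void);
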
> > But then, if that mechanism generally works, we don't really need a new
> > function and we can just go with the first option of making
> > kmem_cache_destroy() asynchronously wait. It'll wait, as you described,
> > but then we adjust the tail of every kfree_rcu batch freeing cycle to
> > check if there are _still_ any old outstanding kmem_cache_destroy()
> > requests. If so, then we can splat and keep the old debugging info we
> > currently have for finding memleaks.
>
> The mechanism can always be sabotaged by memory-leak bugs on the part
> of the user of the kmem_cache structure in play, right?
>
> OK, but I see your point. I added this to the existing
> "kmem_cache_destroy() Lingers for kfree_rcu()" section:
>
> One way of preserving this debugging information is to splat if
> all of the slab’s memory has not been freed within a reasonable
> timeframe, perhaps the same 21 seconds that causes an RCU CPU
> stall warning.
>
> Does that capture it?

Not quite what I was thinking. Your 21-second time-based approach could
be fine, I guess. But I was mostly thinking:

1) kmem_cache_destroy() is called, but there are outstanding objects, so
it defers.
2) Sometime later, a kfree_rcu_work batch freeing operation runs.
3) At the end of this batch freeing, the kernel notices that the
kmem_cache whose destruction was previously deferred still has
outstanding objects and has not been destroyed. It can thus conclude
that there's been a memory leak.

In other words, instead of having to do this based on timers, you can
just have the batch freeing code ask, "did those pending kmem_cache
destructions get completed as a result of this last operation?"
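
Roughly, in pseudo-C -- the list, the deferred_list field, and both
helper functions are invented here purely to illustrate the shape of
the check:

	/* Hypothetical list of caches whose kmem_cache_destroy() had to
	 * be deferred because objects were still outstanding. */
	static LIST_HEAD(deferred_destroy_list);

	/* Run at the tail of each kfree_rcu_work batch freeing cycle. */
	static void check_deferred_destroys(void)
	{
		struct kmem_cache *s, *tmp;

		list_for_each_entry_safe(s, tmp, &deferred_destroy_list,
					 deferred_list) {
			if (!slab_has_outstanding_objects(s)) {
				/* The batch we just freed contained the
				 * last objects; finish the deferred
				 * destruction. */
				list_del(&s->deferred_list);
				actually_destroy_cache(s);
			} else {
				/* A whole batch has come and gone since
				 * the destroy was requested and objects
				 * remain: conclude it's a leak, splat,
				 * and keep the usual debugging info. */
				WARN(1, "kmem_cache %s: objects leaked past deferred destroy\n",
				     s->name);
				list_del(&s->deferred_list);
			}
		}
	}

That way nothing depends on a wall-clock timeout; the leak detection
falls straight out of the batch freeing machinery.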