Message-ID: <4AA03E6A.7070800@gmail.com>
Date: Fri, 04 Sep 2009 00:08:42 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Christoph Lameter <cl@...ux-foundation.org>
CC: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Zdenek Kabelac <zdenek.kabelac@...il.com>,
Patrick McHardy <kaber@...sh.net>, Robin Holt <holt@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Jesper Dangaard Brouer <hawk@...x.dk>,
Linux Netdev List <netdev@...r.kernel.org>,
Netfilter Developers <netfilter-devel@...r.kernel.org>
Subject: Re: [PATCH] slub: fix slab_pad_check()
Christoph Lameter wrote:
> On Thu, 3 Sep 2009, Paul E. McKenney wrote:
>
>> 2. CPU 0 discovers that the slab cache can now be destroyed.
>>
>> It determines that there are no users, and has guaranteed
>> that there will be no future users. So it knows that it
>> can safely do kmem_cache_destroy().
>>
>> 3. In absence of rcu_barrier(), kmem_cache_destroy() would
>> immediately tear down the slab data structures.
>
> Of course. This has been discussed before.
>
> You need to ensure that no objects are in use before destroying a slab. In
> case of DESTROY_BY_RCU you must ensure that there are no potential
> readers. So use a suitable rcu barrier or something else like a
> synchronize_rcu...
>
>>> But going through the RCU period is pointless since no user of the cache
>>> remains.
>> Which is irrelevant. The outstanding RCU callback was posted by the
>> slab cache itself, -not- by the user of the slab cache.
>
> There will be no rcu callbacks generated at kmem_cache_destroy with the
> patch I posted.
>
>>> The dismantling does not need RCU since there are no operations on the
>>> objects in progress. So simply switch DESTROY_BY_RCU off for close.
>> Unless I am missing something, this patch re-introduces the bug that
>> the rcu_barrier() was added to prevent. So, in absence of a better
>> explanation of what I am missing:
>
> The "fix" was ill advised. Slab users must ensure that no objects are in
> use before destroying a slab. Only the slab users know how the objects
> are being used. The slab allocator itself cannot know how to ensure that
> there are no pending references. Putting a rcu_barrier in there creates an
> inconsistency in the operation of kmem_cache_destroy() and an expectation
> of functionality that the function cannot provide.
>
The problem is not _objects_, Christoph, but _slabs_, and your patch does not work.

It is true that when the user calls kmem_cache_destroy(), all _objects_ have already been freed.
This is mandatory, with or without SLAB_DESTROY_BY_RCU.

The problem is that slub has some internal state, including some to-be-freed _slabs_,
over which the user has no control at all.
The user cannot "know" whether slabs are freed, in use, or in some other state in the cache or in the call_rcu queues.

Face it, SLAB_DESTROY_BY_RCU is an internal affair of the slub/slab/... allocators.

We absolutely need an rcu_barrier() somewhere, believe it or not. You can argue that it should
be done *before*, but that gives no speedup, only potential bugs.

The only case where the user should do its rcu_barrier() itself is if it knows some call_rcu() callbacks are pending
and are delaying the freeing of _objects_ (the typical !SLAB_DESTROY_BY_RCU usage in RCU algorithms).

I don't even understand why you care so much about kmem_cache_destroy(SLAB_DESTROY_BY_RCU),
given that almost nobody uses it. It took us almost one month to find out what the bug was in the first
place...
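
To make the slabs-versus-objects point concrete, here is a toy userspace model in C. It is only a sketch and not kernel code: every name in it (toy_cache, toy_defer_slab_free, toy_rcu_barrier, toy_cache_destroy) is hypothetical, the "grace period" is reduced to a plain array of queued callbacks, and the barrier simply runs them. It assumes the allocator, not the user, queued the slab free, which is the situation described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Toy stand-in for a slab cache; not the kernel's struct kmem_cache. */
struct toy_cache {
    bool alive;       /* false once the cache has been torn down */
    int slabs_freed;  /* slabs released while the cache was still valid */
};

/* Toy stand-in for an RCU callback queued by the allocator itself. */
struct toy_rcu_cb {
    struct toy_cache *cache;
};

static struct toy_rcu_cb pending[MAX_PENDING];
static size_t npending;

/* Models the allocator deferring a slab free (call_rcu() in the kernel). */
static void toy_defer_slab_free(struct toy_cache *c)
{
    assert(npending < MAX_PENDING);
    pending[npending++].cache = c;
}

/*
 * Models rcu_barrier(): run every queued callback.
 * Returns how many callbacks found their cache already destroyed;
 * in a real kernel each of those would be a use-after-free.
 */
static int toy_rcu_barrier(void)
{
    int stale = 0;

    for (size_t i = 0; i < npending; i++) {
        struct toy_cache *c = pending[i].cache;

        if (!c->alive)
            stale++;
        else
            c->slabs_freed++;
    }
    npending = 0;
    return stale;
}

/* Models cache teardown, with or without an internal barrier. */
static void toy_cache_destroy(struct toy_cache *c, bool with_barrier)
{
    if (with_barrier)
        toy_rcu_barrier();  /* flush the allocator's own pending slab frees */
    c->alive = false;
}
```

In this model the user has freed every object in both cases; the only difference is whether the destroy path flushes the callbacks that the allocator queued on its own. Without the barrier, a later flush finds a callback pointing at a destroyed cache.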