[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20061203234638.GA506@oleg>
Date: Mon, 4 Dec 2006 02:46:38 +0300
From: Oleg Nesterov <oleg@...sign.ru>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Dipankar Sarma <dipankar@...ibm.com>,
Andrew Morton <akpm@...l.org>,
Alan Stern <stern@...land.harvard.edu>,
linux-kernel@...r.kernel.org
Subject: Re: PATCH? rcu_do_batch: fix a pure theoretical memory ordering race
On 12/04, Eric Dumazet wrote:
>
> Oleg Nesterov a ?crit :
> >
> > int start_me_again;
> >
> > struct rcu_head rcu_head;
> >
> > void rcu_func(struct rcu_head *rcu)
> > {
> > start_me_again = 1;
> > }
> >
> > // could be called on arbitrary CPU
> > void check_start_me_again(void)
> > {
> > static spinlock_t lock;
> >
> > spin_lock(lock);
> > if (start_me_again) {
> > start_me_again = 0;
> > call_rcu(&rcu_head, rcu_func);
> > }
> > spin_unlock(lock);
> > }
> >
> >I'd say this code is not buggy.
>
> Are you sure ? Can you prove it ? :)
Looks like you think differently :)
> I do think your rcu_func() misses some sync primitive, *after*
> start_me_again=1;
> You seem to rely on some undocumented side effect.
> Adding smp_rmb() before calling rcu_func() wont help.
I guess you mean that check_start_me_again() can miss start_me_again != 0 ?
Yes, of course, it should check the condition from time to time. We can even
do
start_me_again = 1;
wake_up(&start_me_again_wq);
, this is still unsafe.
> >>A smp_rmb() wont avoid all possible bugs...
> >
> >For example?
>
> A smp_rmb() wont avoid stores pending on this CPU to be committed to memory
> after another cpu takes the object for itself. Those stores could overwrite
> stores done by the other cpu as well.
Yes. But RCU core doesn't write to rcu_head (except call_rcu). Callback _owns_
rcu_head, it should be ok to use it in any way without fear to break RCU.
Of course, callback should take care of its own locking/ordering.
> So in theory you could design a buggy callback function even after your
> patch applied.
So. Do you claim that rcu_func() above is buggy?
> Any function that can transfer an object from CPU A scope to CPU B scope
> must take care of memory barrier by itself. The caller *cannot* possibly do
> the job, especially if it used an indirect call. However, in some cases it
> is possible some clever algos are doing the reverse, ie doing the memory
> barrier in the callers.
>
> Kernel is full of such constructs :
>
> for (ptr = head; ptr != NULL ; ptr = next) {
> next = ptr->next;
> some_subsys_delete(ptr);
> }
>
> And we dont need to add smp_rmb() before the call to some_subsys_delete(),
> it would be a nightmare, and would slow down modern cpus.
Agreed.
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists