linux-kernel - Re: PATCH? rcu_do_batch: fix a pure theoretical memory ordering race

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20061203234638.GA506@oleg>
Date:	Mon, 4 Dec 2006 02:46:38 +0300
From:	Oleg Nesterov <oleg@...sign.ru>
To:	Eric Dumazet <dada1@...mosbay.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Dipankar Sarma <dipankar@...ibm.com>,
	Andrew Morton <akpm@...l.org>,
	Alan Stern <stern@...land.harvard.edu>,
	linux-kernel@...r.kernel.org
Subject: Re: PATCH? rcu_do_batch: fix a pure theoretical memory ordering race

On 12/04, Eric Dumazet wrote:
>
> Oleg Nesterov a ?crit :
> >
> >	int start_me_again;
> >
> >	struct rcu_head rcu_head;
> >
> >	void rcu_func(struct rcu_head *rcu)
> >	{
> >		start_me_again = 1;
> >	}
> >
> >	// could be called on arbitrary CPU
> >	void check_start_me_again(void)
> >	{
> >		static spinlock_t lock;
> >
> >		spin_lock(lock);
> >		if (start_me_again) {
> >			start_me_again = 0;
> >			call_rcu(&rcu_head, rcu_func);
> >		}
> >		spin_unlock(lock);
> >	}
> >
> >I'd say this code is not buggy.
> 
> Are you sure ? Can you prove it ? :)

Looks like you think differently :)

> I do think your rcu_func() misses some sync primitive, *after* 
> start_me_again=1;
> You seem to rely on some undocumented side effect.
> Adding smp_rmb() before calling rcu_func() wont help.

I guess you mean that check_start_me_again() can miss start_me_again != 0 ?
Yes, of course, it should check the condition from time to time. We can even
do
	start_me_again = 1;
	wake_up(&start_me_again_wq);

, this is still unsafe.

> >>A smp_rmb() wont avoid all possible bugs...
> >
> >For example?
> 
> A smp_rmb() wont avoid stores pending on this CPU to be committed to memory 
> after another cpu takes the object for itself. Those stores could overwrite 
> stores done by the other cpu as well.

Yes. But RCU core doesn't write to rcu_head (except call_rcu). Callback _owns_
rcu_head, it should be ok to use it in any way without fear to break RCU.
Of course, callback should take care of its own locking/ordering.

> So in theory you could design a buggy callback function even after your 
> patch applied.

So. Do you claim that rcu_func() above is buggy?

> Any function that can transfer an object from CPU A scope to CPU B scope 
> must take care of memory barrier by itself. The caller *cannot* possibly do 
> the job, especially if it used an indirect call. However, in some cases it 
> is possible some clever algos are doing the reverse, ie doing the memory 
> barrier in the callers.
> 
> Kernel is full of such constructs :
> 
> for (ptr = head; ptr != NULL ; ptr = next) {
> 	next = ptr->next;
> 	some_subsys_delete(ptr);
> }
> 
> And we dont need to add smp_rmb() before the call to some_subsys_delete(), 
> it would be a nightmare, and would slow down modern cpus.

Agreed.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/