Message-ID: <20110206235136.GA23658@linux.vnet.ibm.com>
Date: Sun, 6 Feb 2011 15:51:36 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Milton Miller <miltonm@....com>
Cc: Peter Zijlstra <peterz@...radead.org>, akpm@...ux-foundation.org,
Anton Blanchard <anton@...ba.org>,
xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
npiggin@...il.com, JBeulich@...ell.com, efault@....de,
rusty@...tcorp.com.au, torvalds@...ux-foundation.org,
benh@...nel.crashing.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3 v2] call_function_many: fix list delete vs add race
On Tue, Feb 01, 2011 at 08:17:40PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 01, 2011 at 02:00:26PM -0800, Milton Miller wrote:
> > On Tue, 1 Feb 2011 about 14:00:26 -0800, "Paul E. McKenney" wrote:
> > > On Tue, Feb 01, 2011 at 01:12:18AM -0600, Milton Miller wrote:
[ . . . ]
> > > o If the bit is set, then we need to process this callback.
> > > IRQs are disabled, so we cannot race with ourselves
> > > -- our bit will remain set until we clear it.
> > > The list_add_rcu() in smp_call_function_many()
> > > in conjunction with the list_for_each_entry_rcu()
> > > in generic_smp_call_function_interrupt() guarantees
> > > 	that all of the fields except for ->refs will be seen as
> > > 	initialized in the common case where we are looking at
> > > 	a callback that has just been enqueued.
> > >
> > > In the uncommon case where we picked up the pointer
> > > in list_for_each_entry_rcu() just before the last
> > > CPU removed the callback and when someone else
> > > immediately recycled it, all bets are off. We must
> > > ensure that we see all initialization via some other
> > > means.
> > >
> > > OK, so where is the memory barrier that pairs with the
> > > smp_rmb() between the ->cpumask and ->refs checks?
> > > It must be before the assignment to ->cpumask. One
> > > candidate is the smp_mb() in csd_lock(), but that does
> > > not make much sense. What we need to do is to ensure
> > > that if we see our bit in ->cpumask, that we also see
> > > the atomic decrement that previously zeroed ->refs.
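For reference, a condensed sketch of the two code paths in question,
paraphrased from the kernel/smp.c of that era; locking, warnings, and the
IPI send are trimmed, so details may differ from the patched source:

	/* smp_call_function_many(): enqueue side */
	data = &__get_cpu_var(cfd_data);
	csd_lock(&data->csd);			/* contains a full smp_mb() */
	data->csd.func = func;
	data->csd.info = info;
	cpumask_and(data->cpumask, mask, cpu_online_mask);
	cpumask_clear_cpu(smp_processor_id(), data->cpumask);
	atomic_set(&data->refs, cpumask_weight(data->cpumask));
	list_add_rcu(&data->csd.list, &call_function.queue);

	/* generic_smp_call_function_interrupt(): handler side */
	list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
		if (!cpumask_test_cpu(cpu, data->cpumask))
			continue;		/* not for us */

		smp_rmb();			/* orders the ->cpumask and ->refs reads */

		if (atomic_read(&data->refs) == 0)
			continue;		/* recycled or not fully initialized */

		data->csd.func(data->csd.info);

		cpumask_test_and_clear_cpu(cpu, data->cpumask);
		if (atomic_dec_return(&data->refs))
			continue;

		/* last CPU out removes and releases the element */
		list_del_rcu(&data->csd.list);
		csd_unlock(&data->csd);		/* contains a full smp_mb() */
	}

The question above is what guarantees that a handler which sees its bit in a
freshly written ->cpumask also sees the third-party atomic_dec_return() that
zeroed ->refs during the element's previous use.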
> >
> > We have a full mb in csd_unlock on the cpu that zeroed refs and a full
> > mb in csd_lock on the cpu that sets mask and later refs.
> >
> > We rely on the atomic returns to order the two atomics, and the
> > atomic_dec_return to establish a single cpu as the last. After
> > that atomic is performed we do a full mb in unlock. At this
> > point all cpus must have visibility to all this prior processing.
> > On the owning cpu we then do a full mb in lock.
> >
> > How can any of the second party writes after the paired mb in lock be
> > visible and not all of the prior third party writes?
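The full barriers being referred to are the ones in the csd locking helpers,
roughly as follows (paraphrased, with comments added for this discussion):

	static void csd_lock(struct call_single_data *data)
	{
		csd_lock_wait(data);		/* spin until CSD_FLAG_LOCK clears */
		data->flags = CSD_FLAG_LOCK;

		/*
		 * Keep the ->flags store from being reordered with the
		 * owner's subsequent stores to ->func, ->info, ->cpumask
		 * and ->refs.
		 */
		smp_mb();
	}

	static void csd_unlock(struct call_single_data *data)
	{
		/*
		 * Make this CPU's processing of the element visible
		 * before the element can be reused.
		 */
		smp_mb();
		data->flags &= ~CSD_FLAG_LOCK;
	}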
>
> Because smp_rmb() is not required to order prior writes against
> subsequent reads. The prior third-party writes are writes, right?
> When you want transitivity (observing n-th party writes that
> n-1-th party observed before n-1-th party's memory barrier), then
> you need a full memory barrier -- smp_mb().
FYI, for an example showing the need for smp_mb() to gain transitivity,
please see the following:
o http://paulmck.livejournal.com/20061.html
o http://paulmck.livejournal.com/20312.html
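For concreteness, a minimal three-CPU sketch of the transitivity point, in
the style of the Documentation/memory-barriers.txt examples; the variable
and function names are illustrative only:

	int x, y;
	int r1, r2, r3;

	void cpu0(void)			/* third party */
	{
		x = 1;			/* W0: the prior third-party write */
	}

	void cpu1(void)			/* second party */
	{
		r1 = x;			/* observes W0 (r1 == 1) */
		smp_mb();		/* full barrier: provides transitivity */
		y = 1;			/* W1 */
	}

	void cpu2(void)			/* first party */
	{
		r2 = y;			/* observes W1 (r2 == 1) */
		smp_rmb();		/* orders cpu2's own two reads */
		r3 = x;			/* must this see W0? */
	}

With the smp_mb() on cpu1, the outcome r1 == 1 && r2 == 1 && r3 == 0 is
forbidden: any CPU that observes W1 must also observe every write that cpu1
itself had observed before its full barrier.  Weaken cpu1's barrier to
smp_rmb() or smp_wmb() and that guarantee is gone -- cpu2's smp_rmb() only
orders cpu2's own reads against each other and says nothing about cpu0's
earlier write.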
Thanx, Paul