Message-ID: <20110206235136.GA23658@linux.vnet.ibm.com>
Date: Sun, 6 Feb 2011 15:51:36 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Milton Miller <miltonm@....com>
Cc: Peter Zijlstra <peterz@...radead.org>, akpm@...ux-foundation.org,
Anton Blanchard <anton@...ba.org>,
xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
npiggin@...il.com, JBeulich@...ell.com, efault@....de,
rusty@...tcorp.com.au, torvalds@...ux-foundation.org,
benh@...nel.crashing.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3 v2] call_function_many: fix list delete vs add race
On Tue, Feb 01, 2011 at 08:17:40PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 01, 2011 at 02:00:26PM -0800, Milton Miller wrote:
> > On Tue, 1 Feb 2011 about 14:00:26 -0800, "Paul E. McKenney" wrote:
> > > On Tue, Feb 01, 2011 at 01:12:18AM -0600, Milton Miller wrote:
[ . . . ]
> > > o If the bit is set, then we need to process this callback.
> > > IRQs are disabled, so we cannot race with ourselves
> > > -- our bit will remain set until we clear it.
> > > The list_add_rcu() in smp_call_function_many()
> > > in conjunction with the list_for_each_entry_rcu()
> > > in generic_smp_call_function_interrupt() guarantees
> > > 	that all of the fields except for ->refs will be seen as
> > > 	initialized in the common case where we are looking at
> > > 	a callback that has just been enqueued.
> > >
> > > In the uncommon case where we picked up the pointer
> > > in list_for_each_entry_rcu() just before the last
> > > CPU removed the callback and when someone else
> > > immediately recycled it, all bets are off. We must
> > > ensure that we see all initialization via some other
> > > means.
> > >
> > > OK, so where is the memory barrier that pairs with the
> > > smp_rmb() between the ->cpumask and ->refs checks?
> > > It must be before the assignment to ->cpumask. One
> > > candidate is the smp_mb() in csd_lock(), but that does
> > > not make much sense. What we need to do is to ensure
> > > that if we see our bit in ->cpumask, that we also see
> > > the atomic decrement that previously zeroed ->refs.
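For reference, a condensed sketch of the two code paths in question,
paraphrased from the kernel/smp.c of that era; locking, warnings, and the
IPI send are trimmed, so details may differ from the patched source:

	/* smp_call_function_many(): enqueue side */
	data = &__get_cpu_var(cfd_data);
	csd_lock(&data->csd);			/* contains a full smp_mb() */
	data->csd.func = func;
	data->csd.info = info;
	cpumask_and(data->cpumask, mask, cpu_online_mask);
	cpumask_clear_cpu(smp_processor_id(), data->cpumask);
	atomic_set(&data->refs, cpumask_weight(data->cpumask));
	list_add_rcu(&data->csd.list, &call_function.queue);

	/* generic_smp_call_function_interrupt(): handler side */
	list_for_each_entry_rcu(data, &call_function.queue, csd.list) {
		if (!cpumask_test_cpu(cpu, data->cpumask))
			continue;		/* not for us */

		smp_rmb();			/* orders the ->cpumask and ->refs reads */

		if (atomic_read(&data->refs) == 0)
			continue;		/* recycled or not fully initialized */

		data->csd.func(data->csd.info);

		cpumask_test_and_clear_cpu(cpu, data->cpumask);
		if (atomic_dec_return(&data->refs))
			continue;

		/* last CPU out removes and releases the element */
		list_del_rcu(&data->csd.list);
		csd_unlock(&data->csd);		/* contains a full smp_mb() */
	}

The question above is what guarantees that a handler which sees its bit in a
freshly written ->cpumask also sees the third-party atomic_dec_return() that
zeroed ->refs during the element's previous use.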
> >
> > We have a full mb in csd_unlock on the cpu that zeroed refs and a full
> > mb in csd_lock on the cpu that sets mask and later refs.
> >
> > We rely on the atomic returns to order the two atomics, and the
> > atomic_dec_return to establish a single cpu as the last. After
> > that atomic is performed we do a full mb in unlock. At this
> > point all cpus must have visibility to all this prior processing.
> > On the owning cpu we then do a full mb in lock.
> >
> > How can any of the second party writes after the paired mb in lock be
> > visible and not all of the prior third party writes?
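The full barriers being referred to are the ones in the csd locking helpers,
roughly as follows (paraphrased, with comments added for this discussion):

	static void csd_lock(struct call_single_data *data)
	{
		csd_lock_wait(data);		/* spin until CSD_FLAG_LOCK clears */
		data->flags = CSD_FLAG_LOCK;

		/*
		 * Keep the ->flags store from being reordered with the
		 * owner's subsequent stores to ->func, ->info, ->cpumask
		 * and ->refs.
		 */
		smp_mb();
	}

	static void csd_unlock(struct call_single_data *data)
	{
		/*
		 * Make this CPU's processing of the element visible
		 * before the element can be reused.
		 */
		smp_mb();
		data->flags &= ~CSD_FLAG_LOCK;
	}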
>
> Because smp_rmb() is not required to order prior writes against
> subsequent reads. The prior third-party writes are writes, right?
> When you want transitivity (observing n-th party writes that
> n-1-th party observed before n-1-th party's memory barrier), then
> you need a full memory barrier -- smp_mb().
FYI, for an example showing the need for smp_mb() to gain transitivity,
please see the following:
o http://paulmck.livejournal.com/20061.html
o http://paulmck.livejournal.com/20312.html
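For concreteness, a minimal three-CPU sketch of the transitivity point, in
the style of the Documentation/memory-barriers.txt examples; the variable
and function names are illustrative only:

	int x, y;
	int r1, r2, r3;

	void cpu0(void)			/* third party */
	{
		x = 1;			/* W0: the prior third-party write */
	}

	void cpu1(void)			/* second party */
	{
		r1 = x;			/* observes W0 (r1 == 1) */
		smp_mb();		/* full barrier: provides transitivity */
		y = 1;			/* W1 */
	}

	void cpu2(void)			/* first party */
	{
		r2 = y;			/* observes W1 (r2 == 1) */
		smp_rmb();		/* orders cpu2's own two reads */
		r3 = x;			/* must this see W0? */
	}

With the smp_mb() on cpu1, the outcome r1 == 1 && r2 == 1 && r3 == 0 is
forbidden: any CPU that observes W1 must also observe every write that cpu1
itself had observed before its full barrier.  Weaken cpu1's barrier to
smp_rmb() or smp_wmb() and that guarantee is gone -- cpu2's smp_rmb() only
orders cpu2's own reads against each other and says nothing about cpu0's
earlier write.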
Thanx, Paul