linux-kernel - Re: call_function_many: fix list delete vs add race

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110201021831.GB2158@linux.vnet.ibm.com>
Date:	Mon, 31 Jan 2011 18:18:31 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Milton Miller <miltonm@....com>, akpm@...ux-foundation.org,
	Anton Blanchard <anton@...ba.org>,
	xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
	npiggin@...il.com, rusty@...tcorp.com.au,
	linux-kernel@...r.kernel.org
Subject: Re: call_function_many: fix list delete vs add race

On Tue, Feb 01, 2011 at 11:39:13AM +1000, Linus Torvalds wrote:
> On Tue, Feb 1, 2011 at 10:22 AM, Benjamin Herrenschmidt
> <benh@...nel.crashing.org> wrote:
> > On Mon, 2011-01-31 at 22:17 +0100, Peter Zijlstra wrote:
> >> That's wrong:
> >>
> >>  ->foo =
> >>  LOCK
> >>  UNLOCK
> >>  ->bar =
> >>
> >> can be re-ordered as:
> >>
> >>  LOCK
> >>  ->bar =
> >>  ->foo =
> >>  UNLOCK
> >
> > Can it ? I though UNLOCK had a write barrier semantic ?
> 
> The re-ordering of ->foo and UNLOCK that Peter claims is definitely
> possible, and the unlock is not guaranteed to be a write barrier. The
> unlock just guarantees "release" consistency, which means that no
> previous reads or writes will be migrated down from within the locked
> region, but it is very much a one-way barrier, so a subsequent write -
> or more commonly a read - could easily migrate up past the unlock.
> 
> (Same for lock, except the permeability is obviously the other way
> around - "acquire" consistency)
> 
> > So yes, ->bar = can leak into the lock, as can ->foo =, but they can't
> > be re-ordered vs. each other because the implied barrier will keep ->foo
> > = in the same "domain" as the unlock itself.
> >
> > Or do other archs do something really nasty here that don't provide this
> > guarantee ?
> 
> I think we actually allow the accesses to ->bar and ->foo to be
> re-ordered wrt each other, exactly because they can *both* get
> re-ordered first into the locked region, and then after that they can
> get re-ordered wrt each other (because there is no other memory
> barrier).
> 
> So a "unlock+lock" is guaranteed to be equivalent to a full memory
> barrier (because an operation before the unlock cannot pass the
> unlock, and an access after the lock cannot percolate up before it).
> But the regular "lock+unlock" sequence is not, exactly because
> accesses outside of it are allowed to first leak in, and then not have
> ordering constraints within the locked region.
> 
> That said, we may have some confusion there, and I would *STRONGLY*
> suggest that architectures should have stronger lock consistency
> guarantees than the theoretical ones. Especially since x86 has such
> strong rules, and locking is a full memory barrier. Anybody with very
> weak ordering is likely to just hit more bugs.
> 
> And so while I'm not sure it's ever been documented, I do think it is
> likely a good idea to just make sure that "lock+unlock" is a full
> memory barrier, the same way "unlock+lock" is. I think it's
> practically true anyway on all architectures (with the possible
> exception of ia64, which I think actually implements real
> acquire/release semantics)

Documentation/memory-barriers.txt specifies this.

	Therefore, from (1), (2) and (4) an UNLOCK followed by an
	unconditional LOCK is equivalent to a full barrier, but a LOCK
	followed by an UNLOCK is not.

							Thanx, Paul

> (Practically speaking, there really aren't many reasons to allow
> writes to be re-ordered wrt lock/unlock operations, and there is no
> reason to ever move reads later, only earlier. Writes are _not_
> performance-sensitive the way reads are, so there is no real reason to
> really take advantage of the one-way permeability for them. It's reads
> that you really want to speculate and do early, not writes, and so
> allowing a read after a UNLOCK to percolate into the critical region
> is really the only relevant case from a performance perspective).
> 
>                     Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/