lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 1 Feb 2011 11:39:13 +1000
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Milton Miller <miltonm@....com>, akpm@...ux-foundation.org,
	Anton Blanchard <anton@...ba.org>,
	xiaoguangrong@...fujitsu.com, mingo@...e.hu, jaxboe@...ionio.com,
	npiggin@...il.com, rusty@...tcorp.com.au,
	paulmck@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org
Subject: Re: call_function_many: fix list delete vs add race

On Tue, Feb 1, 2011 at 10:22 AM, Benjamin Herrenschmidt
<benh@...nel.crashing.org> wrote:
> On Mon, 2011-01-31 at 22:17 +0100, Peter Zijlstra wrote:
>> That's wrong:
>>
>>  ->foo =
>>  LOCK
>>  UNLOCK
>>  ->bar =
>>
>> can be re-ordered as:
>>
>>  LOCK
>>  ->bar =
>>  ->foo =
>>  UNLOCK
>
> Can it ? I though UNLOCK had a write barrier semantic ?

The re-ordering of ->foo and UNLOCK that Peter claims is definitely
possible, and the unlock is not guaranteed to be a write barrier. The
unlock just guarantees "release" consistency, which means that no
previous reads or writes will be migrated down from within the locked
region, but it is very much a one-way barrier, so a subsequent write -
or more commonly a read - could easily migrate up past the unlock.

(Same for lock, except the permeability is obviously the other way
around - "acquire" consistency)

> So yes, ->bar = can leak into the lock, as can ->foo =, but they can't
> be re-ordered vs. each other because the implied barrier will keep ->foo
> = in the same "domain" as the unlock itself.
>
> Or do other archs do something really nasty here that don't provide this
> guarantee ?

I think we actually allow the accesses to ->bar and ->foo to be
re-ordered wrt each other, exactly because they can *both* get
re-ordered first into the locked region, and then after that they can
get re-ordered wrt each other (because there is no other memory
barrier).

So a "unlock+lock" is guaranteed to be equivalent to a full memory
barrier (because an operation before the unlock cannot pass the
unlock, and an access after the lock cannot percolate up before it).
But the regular "lock+unlock" sequence is not, exactly because
accesses outside of it are allowed to first leak in, and then not have
ordering constraints within the locked region.

That said, we may have some confusion there, and I would *STRONGLY*
suggest that architectures should have stronger lock consistency
guarantees than the theoretical ones. Especially since x86 has such
strong rules, and locking is a full memory barrier. Anybody with very
weak ordering is likely to just hit more bugs.

And so while I'm not sure it's ever been documented, I do think it is
likely a good idea to just make sure that "lock+unlock" is a full
memory barrier, the same way "unlock+lock" is. I think it's
practically true anyway on all architectures (with the possible
exception of ia64, which I think actually implements real
acquire/release semantics)

(Practically speaking, there really aren't many reasons to allow
writes to be re-ordered wrt lock/unlock operations, and there is no
reason to ever move reads later, only earlier. Writes are _not_
performance-sensitive the way reads are, so there is no real reason to
really take advantage of the one-way permeability for them. It's reads
that you really want to speculate and do early, not writes, and so
allowing a read after a UNLOCK to percolate into the critical region
is really the only relevant case from a performance perspective).

                    Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ