Date:	Tue, 7 Jun 2016 16:59:02 +0200
From:	Hannes Frederic Sowa <hannes@...essinduktion.org>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Will Deacon <will.deacon@....com>,
	Vineet Gupta <Vineet.Gupta1@...opsys.com>,
	Waiman Long <waiman.long@....com>,
	linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
	manfred@...orfullife.com, dave@...olabs.net, boqun.feng@...il.com,
	tj@...nel.org, pablo@...filter.org, kaber@...sh.net,
	davem@...emloft.net, oleg@...hat.com,
	netfilter-devel@...r.kernel.org, sasha.levin@...cle.com,
	hofrat@...dl.org
Subject: Re: [RFC][PATCH 1/3] locking: Introduce smp_acquire__after_ctrl_dep

On 07.06.2016 15:06, Paul E. McKenney wrote:
> On Tue, Jun 07, 2016 at 02:41:44PM +0200, Hannes Frederic Sowa wrote:
>> On 07.06.2016 09:15, Peter Zijlstra wrote:
>>>>
>>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>>> index 147ae8ec836f..a4d0a99de04d 100644
>>>> --- a/Documentation/memory-barriers.txt
>>>> +++ b/Documentation/memory-barriers.txt
>>>> @@ -806,6 +806,41 @@ out-guess your code.  More generally, although READ_ONCE() does force
>>>>  the compiler to actually emit code for a given load, it does not force
>>>>  the compiler to use the results.
>>>>  
>>>> +In addition, control dependencies apply only to the then-clause and
>>>> +else-clause of the if-statement in question.  In particular, it does
>>>> +not necessarily apply to code following the if-statement:
>>>> +
>>>> +	q = READ_ONCE(a);
>>>> +	if (q) {
>>>> +		WRITE_ONCE(b, p);
>>>> +	} else {
>>>> +		WRITE_ONCE(b, r);
>>>> +	}
>>>> +	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
>>>> +
>>>> +It is tempting to argue that there in fact is ordering because the
>>>> +compiler cannot reorder volatile accesses and also cannot reorder
>>>> +the writes to "b" with the condition.  Unfortunately for this line
>>>> +of reasoning, the compiler might compile the two writes to "b" as
>>>> +conditional-move instructions, as in this fanciful pseudo-assembly
>>>> +language:
>>
>> I wonder if we already guarantee by kernel compiler settings that this
>> behavior is not allowed by at least gcc.
>>
>> We unconditionally set --param allow-store-data-races=0 which should
>> actually prevent gcc from generating such conditional stores.
>>
>> Am I seeing this correct here?
> 
> In this case, the store to "c" is unconditional, so pulling it forward
> would not generate a data race.  However, the compiler is still prohibited
> from pulling it forward because it is not allowed to reorder volatile
> references.  So, yes, the compiler cannot reorder, but for a different
> reason.
> 
> Some CPUs, on the other hand, can do this reordering, as Will Deacon
> pointed out earlier in this thread.

Sorry to follow up on this again. Will Deacon's comments were about
conditional-move instructions, which, as far as I can see, this compiler
option would prevent. That is why I couldn't completely follow your answer:

The writes to b would be non-conditional moves with a control dependency
from a, and an edge down to the write to c, which obviously is
non-conditional. In terms of dependency ordering, we would therefore
always have the control dependency, so couldn't we assume that in a
current kernel we always have a load(a)->store(c) requirement?
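
To make the case concrete, here is a small userspace sketch of the
pattern from the documentation patch. The READ_ONCE()/WRITE_ONCE()
macros below are simplified volatile-cast stand-ins, not the kernel's
definitions, and run_example() is a hypothetical name used only here.
Run single-threaded, it shows only the program-order behaviour; it
cannot demonstrate what a CPU may reorder.

```c
#include <assert.h>

/* Simplified userspace stand-ins (assumption): the kernel's real
 * READ_ONCE()/WRITE_ONCE() are more elaborate than a volatile cast. */
#define READ_ONCE(x)      (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

static int a, b, c;

/* run_example() is a hypothetical name used only for this sketch. */
static int run_example(int p, int r)
{
	int q = READ_ONCE(a);

	if (q)
		WRITE_ONCE(b, p);	/* control-dependent on the load from a */
	else
		WRITE_ONCE(b, r);	/* control-dependent on the load from a */

	/* Executed on both paths: the question is whether this store is
	 * ordered after the load from a. */
	WRITE_ONCE(c, 1);
	return q;
}
```

If the stores to b are emitted as real branches plus stores, both paths
reach the store to c only after the branch on q, which is the
load(a)->store(c) edge I am asking about.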

Is there anything other than conditional-move instructions that could
come into play here? Obviously a much smarter CPU could evaluate all the
jumps and conclude that the write to c never depends on the load from a,
but is this implemented anywhere in hardware?

Thank you,
Hannes
