lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 24 Oct 2015 19:53:56 +0800
From:	Boqun Feng <boqun.feng@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
	Ingo Molnar <mingo@...nel.org>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Paul Mackerras <paulus@...ba.org>,
	Michael Ellerman <mpe@...erman.id.au>,
	Thomas Gleixner <tglx@...utronix.de>,
	Will Deacon <will.deacon@....com>,
	Waiman Long <waiman.long@...com>,
	Davidlohr Bueso <dave@...olabs.net>, stable@...r.kernel.org
Subject: Re: [PATCH tip/locking/core v4 1/6] powerpc: atomic: Make *xchg and
 *cmpxchg a full barrier

On Sat, Oct 24, 2015 at 12:26:27PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 22, 2015 at 08:07:16PM +0800, Boqun Feng wrote:
> > On Wed, Oct 21, 2015 at 09:48:25PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 21, 2015 at 12:35:23PM -0700, Paul E. McKenney wrote:
> > > > > > > > I ask this because I recall Peter once bought up a discussion:
> > > > > > > > 
> > > > > > > > https://lkml.org/lkml/2015/8/26/596
> > > 
> > > > > So a full barrier on one side of these operations is enough, I think.
> > > > > IOW, there is no need to strengthen these operations.
> > > > 
> > > > Do we need to also worry about other futex use cases?
> > > 
> > > Worry, always!
> > > 
> > > But yes, there is one more specific usecase, which is that of a
> > > condition variable.
> > > 
> > > When we go sleep on a futex, we might want to assume visibility of the
> > > stores done by the thread that woke us by the time we wake up.
> > > 
> > 
> > But the thing is futex atomics in PPC are already RELEASE(pc)+ACQUIRE
> > and imply a full barrier, is an RELEASE(sc) semantics really needed
> > here?
> 
> For this, no, the current code should be fine I think.
> 
> > Further more, is this condition variable visibility guaranteed by other
> > part of futex? Because in futex_wake_op:
> > 
> > 	futex_wake_op()
> > 	  ...
> > 	  double_unlock_hb(hb1, hb2);  <- RELEASE(pc) barrier here.
> > 	  wake_up_q(&wake_q);
> > 
> > and in futex_wait():
> > 
> > 	futex_wait()
> > 	  ...
> > 	  futex_wait_queue_me(hb, &q, to); <- schedule() here
> > 	  ...
> > 	  unqueue_me(&q)
> > 	    drop_futex_key_refs(&q->key);
> > 	    	iput()/mmdrop(); <- a full barrier
> > 	  
> > 
> > The RELEASE(pc) barrier pairs with the full barrier, therefore the
> > userspace wakee can observe the condition variable modification.
> 
> Right, futexes are a pain; and I think we all agreed we didn't want to
> go rely on implementation details unless we absolutely _have_ to.
> 

Agreed.

Besides, after I have read why futex_wake_op(the caller of
futex_atomic_op_inuser()) is introduced, I think your worries are quite
reasonable. I thought the futex_atomic_op_inuser() only operated on
futex related variables, but it turns out it can actually operate any
userspace variable if userspace code likes, therefore we don't have
control of all memory ordering guarantee of the variable. So if PPC
doesn't provide a full barrier at user<->kernel boundries, we should
make futex_atomic_op_inuser() fully ordered.


Still looking into futex_atomic_cmpxchg_inatomic() ...

> > > And.. aside from the thoughts I outlined in the email referenced above,
> > > there is always the chance people accidentally rely on the strong
> > > ordering on their x86 CPU and find things come apart when ran on their
> > > ARM/MIPS/etc..
> > > 
> > > There are a fair number of people who use the raw futex call and we have
> > > 0 visibility into many of them. The assumed and accidental ordering
> > > guarantees will forever remain a mystery.
> > > 
> > 
> > Understood. That's truely a potential problem. Considering not all the
> > architectures imply a full barrier at user<->kernel boundries, maybe we
> > can use one bit in the opcode of the futex system call to indicate
> > whether userspace treats futex as fully ordered. Like:
> > 
> > #define FUTEX_ORDER_SEQ_CST  0
> > #define FUTEX_ORDER_RELAXED  64 (bit 7 and bit 8 are already used)
> 
> Not unless there's an actual performance problem with any of this.
> Futexes are painful enough as is.

Make sense, and we still have choices like modifying the userspace code
if there is actually a performance problem ;-)

Regards,
Boqun

Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ