Date:   Fri, 7 Apr 2017 17:13:59 +0100
From:   Will Deacon <will.deacon@....com>
To:     Nicholas Piggin <npiggin@...il.com>
Cc:     David Miller <davem@...emloft.net>, torvalds@...ux-foundation.org,
        linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
        anton@...ba.org, linuxppc-dev@...abs.org, peterz@...radead.org
Subject: Re: [RFC][PATCH] spin loop arch primitives for busy waiting

On Fri, Apr 07, 2017 at 01:30:11AM +1000, Nicholas Piggin wrote:
> On Thu, 6 Apr 2017 15:13:53 +0100
> Will Deacon <will.deacon@....com> wrote:
> > On Thu, Apr 06, 2017 at 10:59:58AM +1000, Nicholas Piggin wrote:
> > > Thanks for taking a look. The default spin primitives should just
> > > continue to do the right thing for you in that case.
> > > 
> > > Arm has a yield instruction, ia64 has a pause... No unusual
> > > requirements that I can see.  
> > 
> > Yield tends to be implemented as a NOP in practice, since it's in the
> > architecture for SMT CPUs and most ARM CPUs are single-threaded. We do have
> > the WFE instruction (wait for event) which is used in our implementation of
> > smp_cond_load_acquire, but I don't think we'd be able to use it with the
> > proposals here.
> > 
> > WFE can stop the clock for the CPU until an "event" is signalled by
> > another CPU. This could be done by an explicit SEV (send event) instruction,
> > but that tends to require heavy barriers on the signalling side. Instead,
> > the preferred way to generate an event is to clear the exclusive monitor
> > reservation for the CPU executing the WFE. That means that the waiter
> > does something like:
> > 
> > 	LDXR x0, [some_address]	// Load exclusive from some_address
> > 	CMP  x0, some value	// If the value matches what I want
> > 	B.EQ out		// then we're done
> > 	WFE			// otherwise, wait
> > 
> > at this point, the waiter will stop on the WFE until its monitor is cleared,
> > which happens if another CPU writes to some_address.
> > 
> > We've wrapped this up in the arm64 code as __cmpwait, and we use that
> > to build smp_cond_load_acquire. It would be nice to use the same machinery
> > for the conditional spinning here, unless you anticipate that we're only
> > going to be spinning for a handful of iterations anyway?
> 
> So I do want to look at adding spin loop primitives as well as the
> begin/in/end primitives to help with powerpc's SMT priorities.
> 
> So we'd have:
> 
>   spin_begin();
>   spin_do {
>     if (blah) {
>         spin_end();
>         return;
>     }
>   } spin_until(!locked);
>   spin_end();
> 
> So you could implement your monitor with that. There's a handful of core
> places. mutex, bit spinlock, seqlock, polling idle, etc. So I think if it
> is beneficial for you in smp_cond_load_acquire, it should be useful in
> those too.

Yeah, I think we should be able to implement spin_until like we do for
smp_cond_load_acquire, although it means we need to pass in the pointer as
well.
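
As a rough illustration of why the pointer matters, here is a minimal sketch
in portable C11. It is not the kernel implementation and not the RFC's API:
spin_until_equal(), spin_wait_on() and spin_wait_calls are hypothetical names
made up for this example, and spin_begin()/spin_end() are empty stand-ins for
the proposed hooks. The point is only that the per-iteration wait hook
receives the address being polled, so an architecture could wait on that
location (for example LDXR + WFE on arm64) instead of executing a plain
relax instruction.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the proposed begin/end hooks (no-ops here). */
static inline void spin_begin(void) { }
static inline void spin_end(void) { }

/* Counts calls to the wait hook; exists only so the sketch can be checked. */
static unsigned long spin_wait_calls;

/*
 * Hypothetical per-iteration wait hook. It is handed the polled address so
 * an architecture could arm a monitor on it and sleep until it is written.
 * This portable version just counts the call and returns immediately.
 */
static inline void spin_wait_on(const atomic_int *addr)
{
	(void)addr;
	spin_wait_calls++;
}

/*
 * Spin until *ptr == wanted, loading with acquire ordering.
 * Returns the value that satisfied the condition.
 */
static int spin_until_equal(atomic_int *ptr, int wanted)
{
	int val;

	spin_begin();
	while ((val = atomic_load_explicit(ptr, memory_order_acquire)) != wanted)
		spin_wait_on(ptr);
	spin_end();

	return val;
}
```

A caller would pass the lock word or flag directly, e.g.
spin_until_equal(&flag, 1), rather than an opaque condition expression.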

Will
