linux-kernel - Re: [RFC PATCH 0/3] restartable sequences v2: fast user-space percpu critical sections

Message-ID: <CALCETrUi+TXUD3PJqm9BqV+e9ozi7nJDp2u+gpUHBzPxe8Ub9A@mail.gmail.com>
Date:	Fri, 8 Apr 2016 08:58:27 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...hat.com>,
	Paul Turner <commonly@...il.com>, Chris Lameter <cl@...ux.com>,
	Andi Kleen <andi@...stfloor.org>,
	Josh Triplett <josh@...htriplett.org>,
	Dave Watson <davejwatson@...com>,
	Linux API <linux-api@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andrew Hunter <ahh@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [RFC PATCH 0/3] restartable sequences v2: fast user-space percpu
 critical sections

On Apr 7, 2016 11:41 PM, "Peter Zijlstra" <peterz@...radead.org> wrote:
>
> On Thu, Apr 07, 2016 at 09:43:33AM -0700, Andy Lutomirski wrote:
> > enter the critical section:
> > 1:
> > movq %[cpu], %%r12
> > movq {address of counter for our cpu}, %%r13
> > movq {some fresh value}, (%%r13)
> > cmpq %[cpu], %%r12
> > jne 1b
>
> This is inherently racy; you forgot the detail of 'some fresh value',
> but since you want to avoid collisions you really want an increment.
>
> But load-store archs cannot do that. Or rather, they need to do:
>
>         load    Rn, $event
>         add     Rn, Rn, 1
>         store   $event, Rn
>
> But if they're preempted in the middle, two threads will collide and
> generate the _same_ increment. Comparing CPU numbers will not fix that.

Even on x86 this won't work -- we have no actual guarantee we're on
the right CPU, so we'd have to use an atomic.
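
As a rough sketch of what "use an atomic" could mean here, assuming C11
atomics and a plain array of per-CPU event counters (the names
event_counter, NR_CPUS_SKETCH and bump_event are made up for this
example and are not from the patch set):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bound, chosen only for this sketch. */
#define NR_CPUS_SKETCH 64

/* One event counter per CPU; static storage starts at zero. */
static _Atomic uint64_t event_counter[NR_CPUS_SKETCH];

/*
 * Atomically increment the counter for 'cpu' and return the new value.
 * Because the read-modify-write is a single atomic operation, two
 * threads that both target the same counter (for example because one
 * of them migrated after reading its CPU number) still get distinct
 * values.  On x86 this typically becomes a lock-prefixed xadd; on
 * load-store architectures it typically becomes an LL/SC loop or a
 * dedicated atomic instruction.
 */
static inline uint64_t bump_event(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SKETCH);
    /* Relaxed ordering is enough for uniqueness of the returned value. */
    return atomic_fetch_add_explicit(&event_counter[cpu], 1,
                                     memory_order_relaxed) + 1;
}
```

The cost is exactly what the restartable-sequence approach is trying to
avoid: an atomic read-modify-write on the fast path.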

I was thinking we'd allocate from a per-thread pool (say 24 bits of
thread ID and the rest being a nonce).  On load-store architectures
this wouldn't be async-signal-safe, though.  Hmm.
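
A minimal sketch of such a per-thread pool, assuming a 64-bit value with
the thread ID in the top 24 bits and a 40-bit nonce below it (the bit
layout, struct fresh_pool, fresh_pool_init and fresh_value are all
assumptions for illustration):

```c
#include <assert.h>
#include <stdint.h>

#define TID_BITS   24
#define NONCE_BITS (64 - TID_BITS)
#define NONCE_MASK ((UINT64_C(1) << NONCE_BITS) - 1)

/* Per-thread state; each thread owns exactly one of these. */
struct fresh_pool {
    uint32_t tid;        /* must fit in TID_BITS */
    uint64_t next_nonce; /* next unused nonce for this thread */
};

static inline void fresh_pool_init(struct fresh_pool *p, uint32_t tid)
{
    assert(tid < (UINT32_C(1) << TID_BITS));
    p->tid = tid;
    p->next_nonce = 0;
}

/*
 * Hand out a value that no other thread can produce, since the thread
 * ID is embedded in it.  No atomics are needed across threads.
 *
 * Not async-signal-safe as written: the nonce update is a separate
 * load and store, so a signal handler that runs between them and also
 * calls fresh_value() on the same pool would receive the same nonce.
 */
static inline uint64_t fresh_value(struct fresh_pool *p)
{
    uint64_t nonce = p->next_nonce;             /* load */
    p->next_nonce = (nonce + 1) & NONCE_MASK;   /* add, store */
    return ((uint64_t)p->tid << NONCE_BITS) | nonce;
}
```

The nonce wraps after 2^40 allocations per thread in this layout, so
uniqueness only holds within that window.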

--Andy