[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160804042710.GA13232@tardis.cn.ibm.com>
Date: Thu, 4 Aug 2016 12:27:10 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Peter Zijlstra <peterz@...radead.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Russell King <linux@....linux.org.uk>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-api <linux-api@...r.kernel.org>,
Paul Turner <pjt@...gle.com>, Andrew Hunter <ahh@...gle.com>,
Andi Kleen <andi@...stfloor.org>,
Dave Watson <davejwatson@...com>, Chris Lameter <cl@...ux.com>,
Ben Maurer <bmaurer@...com>, rostedt <rostedt@...dmis.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Josh Triplett <josh@...htriplett.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Michael Kerrisk <mtk.manpages@...il.com>
Subject: Re: [RFC PATCH v7 1/7] Restartable sequences system call
On Wed, Aug 03, 2016 at 09:37:57AM -0700, Andy Lutomirski wrote:
> On Wed, Aug 3, 2016 at 5:27 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Tue, Jul 26, 2016 at 03:02:19AM +0000, Mathieu Desnoyers wrote:
> >> We really care about preemption here. Every migration implies a
> >> preemption from a user-space perspective. If we would only care
> >> about keeping the CPU id up-to-date, hooking into migration would be
> >> enough. But since we want atomicity guarantees for restartable
> >> sequences, we need to hook into preemption.
> >
> >> It allows user-space to perform update operations on per-cpu data without
> >> requiring heavy-weight atomic operations.
> >
> > Well, a CMPXCHG without LOCK prefix isn't all that expensive on x86.
> >
> > It is however on PPC and possibly other architectures, so in name of
> > simplicity supporting only the one variant makes sense.
> >
>
> I wouldn't want to depend on CMPXCHG. But imagine we had primitives
> that were narrower than the full abort-on-preemption primitive.
> Specifically, suppose we had abort if (actual cpu != expected_cpu ||
> *aptr != aval). We could do things like:
>
> expected_cpu = cpu;
> aval = NULL; // disarm for now
> begin();
> aval = event_count[cpu] + 1;
> event_count[cpu] = aval;
> event_count[cpu]++;
This line is redundant, right? Because it will guarantee a failure even
in no-contention cases.
>
> ... compute something ...
>
> // arm the rest of it
> aptr = &event_count[cpu];
> if (*aptr != aval)
> goto fail;
>
> *thing_im_writing = value_i_computed;
> end();
>
> The idea here is that we don't rely on the scheduler to increment the
> event count at all, which means that we get to determine the scope of
> what kinds of access conflicts we care about ourselves.
>
If we increase the event count in userspace, how could we prevent two
userspace threads from racing on the event_count[cpu] field? For
example:
CPU 0
================
{event_count[0] is initially 0}
[Thread 1]
begin();
aval = event_count[cpu] + 1; // 1
(preempted)
[Thread 2]
begin();
aval = event_count[cpu] + 1; // 1, too
event_count[cpu] = aval; // event_count[0] is 1
(preempted)
[Thread 1]
event_count[cpu] = aval; // event_count[0] is 1
...
aptr = &event_count[cpu];
if (*aptr != aval) // false.
...
[Thread 2]
aptr = &event_count[cpu];
if (*aptr != aval) // false.
...
, in which case, both the critical sections are successful, and Thread 1
and Thread 2 will race on *thing_im_writing.
Am I missing your point here?
Regards,
Boqun
> This has an obvious downside: it's more complicated.
>
> It has several benefits, I think. It's debuggable without hassle
> (unless someone, accidentally or otherwise, sets aval incorrectly).
> It also allows much longer critical sections to work well, as merely
> being preempted in the middle won't cause an abort any more.
>
> So I'm hoping to understand whether we could make something like this
> work. This whole thing is roughly equivalent to abort-if-migrated
> plus an atomic "if (*aptr == aval) *b = c;" operation.
>
> (I think that, if this worked, we could improve it a bit by making the
> abort operation jump back to the "if (*aptr != aval) goto fail;" code,
> which should reduce the scope for error a bit and also reduces the
> need for extra code paths that only execute on an abort.)
Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)
Powered by blists - more mailing lists