Message-ID: <CAKOZueu_7Ha_WXMyxqMEScXo1aHWr9qYRxqyb-Rpd4k1JP3xHA@mail.gmail.com>
Date: Wed, 02 May 2018 16:07:48 +0000
From: Daniel Colascione <dancol@...gle.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
boqun.feng@...il.com, luto@...capital.net, davejwatson@...com,
linux-kernel@...r.kernel.org, linux-api@...r.kernel.org,
Paul Turner <pjt@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux@....linux.org.uk, tglx@...utronix.de, mingo@...hat.com,
hpa@...or.com, Andrew Hunter <ahh@...gle.com>, andi@...stfloor.org,
cl@...ux.com, bmaurer@...com, rostedt@...dmis.org,
josh@...htriplett.org, torvalds@...ux-foundation.org,
catalin.marinas@....com, will.deacon@....com,
Michael Kerrisk-manpages <mtk.manpages@...il.com>,
Joel Fernandes <joelaf@...gle.com>
Subject: Re: [RFC PATCH for 4.18 00/14] Restartable Sequences
On Wed, May 2, 2018 at 9:03 AM Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> ----- On May 1, 2018, at 11:53 PM, Daniel Colascione dancol@...gle.com wrote:
> [...]
> >
> > I think a small enhancement to rseq would let us build a perfect userspace
> > mutex, one that spins on lock-acquire only when the lock owner is running
> > and that sleeps otherwise, freeing userspace from both specifying ad-hoc
> > spin counts and from trying to detect situations in which spinning is
> > generally pointless.
> >
> > It'd work like this: in the per-thread rseq data structure, we'd include a
> > description of a futex operation that the kernel would perform (in the
> > context of the preempted thread) upon preemption, immediately before
> > schedule(). If the futex operation itself sleeps, that's no problem: we
> > will have still accomplished our goal of running some other thread instead
> > of the preempted thread.
> Hi Daniel,
>
> I agree that the problem you are aiming to solve is important. Let's see
> what prevents the proposed rseq implementation from doing what you envision.
>
> The main issue here is touching userspace immediately before schedule().
> At that specific point, it's not possible to take a page fault. In the
> proposed rseq implementation, we get away with it by raising a task struct
> flag, and using it in a return to userspace notifier (where we can actually
> take a fault), where we touch the userspace TLS area.
>
> If we can find a way to solve this limitation, then the rest of your design
> makes sense to me.
Thanks for taking a look!
Why couldn't we take a page fault just before schedule? The reason we can't
take a page fault in atomic context is that doing so might call schedule.
Here, we're about to call schedule _anyway_, so what harm does it do to
call something that might call schedule? If we schedule via that call, we
can skip the manual schedule we were going to perform.
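
To make that control flow concrete, here is a rough userspace model of it. This is only a simulation under stated assumptions, not kernel code: `struct preempt_futex_op`, `struct sim_task`, `sim_schedule()`, `sim_run_preempt_op()` and `sim_preempt()` are all hypothetical names, and the "futex operation" is reduced to a plain store. The `faults` flag stands in for "touching the user address took a page fault that slept".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a futex operation described in the per-thread rseq area.
 * Not part of the proposed rseq ABI; for illustration only. */
struct preempt_futex_op {
    uint32_t *uaddr;  /* user word the operation touches */
    uint32_t  val;    /* value to store on preemption */
    bool      armed;  /* whether userspace asked for the operation */
};

/* Hypothetical stand-in for a task; counts how often we "schedule". */
struct sim_task {
    struct preempt_futex_op op;
    int schedule_calls;
};

/* Models schedule(): here it only records that it was called. */
static void sim_schedule(struct sim_task *t)
{
    t->schedule_calls++;
}

/* Models performing the operation just before schedule().
 * If touching user memory "faults", the fault path sleeps, which we
 * model as a call to sim_schedule(). Returns true if that happened. */
static bool sim_run_preempt_op(struct sim_task *t, bool faults)
{
    if (!t->op.armed)
        return false;
    assert(t->op.uaddr != NULL);
    if (faults)
        sim_schedule(t);        /* the fault handler slept for us */
    *t->op.uaddr = t->op.val;   /* the (simplified) futex operation */
    return faults;
}

/* Models the preemption path: run the operation, then schedule
 * manually only if the operation did not already do so. */
static void sim_preempt(struct sim_task *t, bool faults)
{
    bool already_scheduled = sim_run_preempt_op(t, faults);

    if (!already_scheduled)
        sim_schedule(t);
}
```

In both the faulting and the non-faulting case, one call to `sim_preempt()` results in exactly one call to `sim_schedule()`, which is the property argued for above. Whether the real preemption path can tolerate sleeping at that point is exactly the open question.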