Message-ID: <1779765540.20682.1589424713646.JavaMail.zimbra@efficios.com>
Date: Wed, 13 May 2020 22:51:53 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: rostedt <rostedt@...dmis.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
x86 <x86@...nel.org>, paulmck <paulmck@...nel.org>,
Andy Lutomirski <luto@...nel.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Frederic Weisbecker <frederic@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Petr Mladek <pmladek@...e.com>,
"Joel Fernandes, Google" <joel@...lfernandes.org>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Juergen Gross <jgross@...e.com>,
Brian Gerst <brgerst@...il.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [patch V4 part 1 05/36] x86/entry: Flip _TIF_SIGPENDING and
_TIF_NOTIFY_RESUME handling
----- On May 13, 2020, at 8:12 PM, Thomas Gleixner tglx@...utronix.de wrote:
[...]
>
>>> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>
>>> Also, color me confused: is "do_signal()" actually running any user-space,
>>> or just setting up the user-space stack for eventual return to signal
>>> handler ?
>
> I'm surprised that you can't answer that question yourself. How did you
> ever make rseq work and how did rseq_signal_deliver() end up in
> setup_rt_frame()?
>
> Hint: Tracing might answer that question
>
> And to cut it short:
>
> Exit to user space happens only through ONE channel, i.e. leaving
> prepare_exit_to_usermode().
>
[...]
Yes, I'm very well aware of this. But the patch commit message states:
"Make sure task_work runs before any kind of userspace -- very much
including signals -- is invoked."
which seems to imply that "userspace" can be "invoked" before the task_work
runs. Which makes no sense whatsoever. Hence my confused state.
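To spell out how I read the exit path, here is a small user-space model. It is only a sketch: the MODEL_* flags, struct exit_model and every function name are made up for illustration and are not the kernel's. It assumes what was said above, namely that do_signal() only sets up the signal frame and that user-space code runs after the work loop has drained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for this sketch only; not the kernel's _TIF_* bits. */
#define MODEL_SIGPENDING    (1u << 0)
#define MODEL_NOTIFY_RESUME (1u << 1)
#define MODEL_WORK_MASK     (MODEL_SIGPENDING | MODEL_NOTIFY_RESUME)

struct exit_model {
    unsigned int flags;          /* pending exit work */
    bool frame_set_up;           /* signal frame prepared on the user stack */
    bool task_work_done;         /* task_work has run */
    int handler_runs;            /* times user-space handler code ran */
    bool handler_saw_task_work;  /* was task_work done when it ran? */
};

/* Model of signal delivery: prepares the frame, runs no user code. */
static void model_signal_work(struct exit_model *m)
{
    if (m->flags & MODEL_SIGPENDING) {
        m->flags &= ~MODEL_SIGPENDING;
        m->frame_set_up = true;
    }
}

/* Model of notify-resume handling: runs the queued task_work. */
static void model_resume_work(struct exit_model *m)
{
    if (m->flags & MODEL_NOTIFY_RESUME) {
        m->flags &= ~MODEL_NOTIFY_RESUME;
        m->task_work_done = true;
    }
}

/* Drain all pending work in either order, then leave through one exit. */
static void model_exit_to_user(struct exit_model *m, bool signal_first)
{
    assert(m != NULL);
    while (m->flags & MODEL_WORK_MASK) {
        if (signal_first) {
            model_signal_work(m);
            model_resume_work(m);
        } else {
            model_resume_work(m);
            model_signal_work(m);
        }
    }
    /* Single exit point: only now can the handler start executing. */
    if (m->frame_set_up) {
        m->handler_runs++;
        m->handler_saw_task_work = m->task_work_done;
    }
}
```

In this model the handler observes completed task_work for both values of signal_first, which is why the wording of the commit message confuses me.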
>>> Also, it might be OK, but we're changing the order of two things which
>>> have effects on each other: restartable sequences abort fixup for preemption
>>> and do_signal(), which also has effects on rseq abort.
>>>
>>> Because those two will cause the abort to trigger, I suspect changing
>>> the order might be OK, but we really need to think this through.
>
> That's a purely academic problem. The order is completely
> irrelevant. You have to handle any order anyway:
Yes indeed, whether a signal handler frame fixup or a return IP fixup
fires first (clearing the rseq_cs pointer in the process) is irrelevant,
because they will have the same effect on the user-space program's
flow. And as you say, given that it runs in a loop and can be preempted,
any order can happen here, so we have to be prepared for it. This loop
has caused me tons of headaches when stress-testing on NUMA machines, by
the way.
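The order-independence argument can be sketched with a toy model. This is not kernel code: struct rseq_model, model_rseq_abort() and model_run() are hypothetical names, and the in_cs flag merely stands in for a non-NULL rseq_cs pointer. The assumption it encodes is the one stated above: whichever fixup fires first redirects execution to the abort handler and clears rseq_cs, so the second one finds nothing left to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space model; every name here is made up for illustration. */
enum model_work { MODEL_SIGNAL, MODEL_PREEMPT_FIXUP };

struct rseq_model {
    bool in_cs;               /* stands in for a non-NULL rseq_cs pointer */
    unsigned long resume_ip;  /* where the interrupted code will continue */
    unsigned long abort_ip;   /* abort handler of the critical section */
    int aborts;               /* how many times the abort was applied */
};

/* Shared by both paths: abort the critical section once, then clear it. */
static void model_rseq_abort(struct rseq_model *t)
{
    if (!t->in_cs)
        return;               /* an earlier fixup already handled it */
    t->resume_ip = t->abort_ip;
    t->in_cs = false;
    t->aborts++;
}

/* Apply the pending work items in the given order and return the result. */
static struct rseq_model model_run(const enum model_work *order, size_t n)
{
    struct rseq_model t = {
        .in_cs = true, .resume_ip = 0x1000, .abort_ip = 0x2000, .aborts = 0,
    };

    assert(order != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (order[i]) {
        case MODEL_SIGNAL:         /* fixup while setting up the frame */
        case MODEL_PREEMPT_FIXUP:  /* fixup applied to the return IP */
            model_rseq_abort(&t);
            break;
        }
    }
    return t;
}
```

Calling model_run() with { MODEL_SIGNAL, MODEL_PREEMPT_FIXUP } and with the reverse order gives the same final state in this model: one abort, resume_ip equal to abort_ip, and the critical section cleared.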
> That said, even for the case Andy and Peter were looking at (MCE) the
> ordering is completely irrelevant.
Not sure about that, see Andy's follow-up.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com