[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v9kzz862.fsf@nanos.tec.linutronix.de>
Date: Thu, 14 May 2020 02:12:21 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>,
paulmck <paulmck@...nel.org>, Andy Lutomirski <luto@...nel.org>,
Alexandre Chartre <alexandre.chartre@...cle.com>,
Frederic Weisbecker <frederic@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Petr Mladek <pmladek@...e.com>,
"Joel Fernandes\, Google" <joel@...lfernandes.org>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Juergen Gross <jgross@...e.com>,
Brian Gerst <brgerst@...il.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [patch V4 part 1 05/36] x86/entry: Flip _TIF_SIGPENDING and _TIF_NOTIFY_RESUME handling
Steven, Mathieu
(combo reply)
Steven Rostedt <rostedt@...dmis.org> writes:
> On Wed, 13 May 2020 16:56:41 -0400 (EDT)
>> > + /* deal with pending signal delivery */
>> > + if (cached_flags & _TIF_SIGPENDING)
>> > + do_signal(regs);
>
> Looking deeper into this, it appears that do_signal() can freeze or kill the
> task.
>
> That is, it wont go back to user space here, but simply schedule out (being
> traced) or even exit (killed).
>
> Before the resume hooks would never be called in such cases, and now they
> are.
It theoretically matters because pending task work might kill the
task. That's the concern Andy and Peter had. Assume the following:
usermode
-> exception
set not fatal signal
-> exception
queue task work to kill task
<- return
<- return
The same could happen when the non fatal signal is set from a remote CPU.
So in theory that would result in:
handle non fatal signal first
handle task work which kills task
which would be the wrong order.
But that's just illusion.
>> Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
>> Also, color me confused: is "do_signal()" actually running any user-space,
>> or just setting up the user-space stack for eventual return to signal
>> handler ?
I'm surprised that you can't answer that question yourself. How did you
ever make rseq work and how did rseq_signal_deliver() end up in
setup_rt_frame()?
Hint: Tracing might answer that question
And to cut it short:
Exit to user space happnes only through ONE channel, i.e. leaving
prepare_exit_to usermode().
exit_to_usermode_loop <-prepare_exit_to_usermode
do_signal <-exit_to_usermode_loop
get_signal <-do_signal
setup_sigcontext <-do_signal
do_syscall_64 <-entry_SYSCALL_64_after_hwframe
syscall_trace_enter <-do_syscall_64
sys_rt_sigreturn()
restore_altstack <-__ia32_sys_rt_sigreturn
syscall_slow_exit_work <-do_syscall_64
exit_to_usermode_loop <-do_syscall_64
>> Also, it might be OK, but we're changing the order of two things which
>> have effects on each other: restartable sequences abort fixup for preemption
>> and do_signal(), which also have effects on rseq abort.
>>
>> Because those two will cause the abort to trigger, I suspect changing
>> the order might be OK, but we really need to think this through.
That's a purely academic problem. The order is completely
irrelevant. You have to handle any order anyway:
usermode
-> exception / syscall
sets signal
<- return
prepare_exit_to_usemode()
cached_flags = READ_ONCE(t->flags);
exit_to_user_mode_loop(regs, cached_flags) {
while (cached_flags) {
local_irq_enable();
handle(cached_flags & RESCHED);
handle(cached_flags & UPROBE);
handle(cached_flags & PATCHING);
handle(cached_flags & SIGNAL);
handle(cached_flags & NOTIFY_RESUME);
handle(cached_flags & RETURN_NOTIFY);
local_irq_disable();
cached_flags = READ_ONCE(t->flags);
}
cached_flag is a momentary snapshot when attempting to return to user
space.
But after reenabling interrupts any of the relevant flag bits can be set
by an exception/interrupt or from remote. If preemption is enabled the
task can be scheduled out, migrated at any point before disabling
interrupts again. Even after disabling interrupts and before re-reading
cached flags there might be a remote change of flags.
That said, even for the case Andy and Peter were looking at (MCE) the
ordering is completely irrelevant.
I should have thought about this before, so thanks to both of you for
making me look at it again for the very wrong reasons.
Consider the patch dropped.
Thanks,
tglx
Powered by blists - more mailing lists