[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v91kqt6b.fsf@disp2133>
Date: Mon, 25 Oct 2021 23:57:48 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Andy Lutomirski <luto@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-arch <linux-arch@...r.kernel.org>,
Oleg Nesterov <oleg@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Kees Cook <keescook@...omium.org>
Subject: Re: [PATCH 13/20] signal: Implement force_fatal_sig
Linus Torvalds <torvalds@...ux-foundation.org> writes:
> On Mon, Oct 25, 2021 at 3:41 PM Andy Lutomirski <luto@...nel.org> wrote:
>>
>> I'm rather nervous about all this, and I'm also nervous about the
>> existing code. A quick skim is finding plenty of code paths that assume
>> force_sigsegv (or a do_exit that this series touches) are genuinely
>> unrecoverable.
>
> I was going to say "what are you talking about", because clearly Eric
> kept it all fatal.
>
> But then looked at that patch a bit more before I claimed you were wrong.
>
> And yeah, Eric's force_fatal_sig() is completely broken.
>
> It claims to force a fatal signal, but doesn't actually do that at
> all, and is completely misnamed.
>
> It just uses "force_sig_info_to_task()", which still allows user space
> to catch signals - so it's not "fatal" in the least. It only punches
> through SIG_IGN and blocked signals.
Rereading this I think you might be misreading something.
force_siginfo_to_task takes a sigdfl parameter which I am setting in
force_fatal_signal.
When that sigdfl paramter is set force_siginfo_to_task always changes
the signal handler to SIGDFL, and always unblocks the signal.
Because the siglock remains held over send_signal none of those
properties can change during send_signal. Which means that as long
as we are not talking about a coredump signal complete_signal is
guaranteed to recognize the signal as fatal immediately.
For coredump signals there is a race where siglock is dropped before
get_signal is called that could result in the signal handler being
changed or the signal being blocked. Which is why I pointed out the
problem is coredumps.
But assuming userspace does not change something in that narrow window
the signal will most definitely be fatal to the target process.
Just as soon as I know if we can have per signal_struct coredumps
without causing regressions I will close the final race. I can do it
either way but the code is much less complicated with per signal_struct
coredumps.
Hoisting the current zap_threads from fs/coredump.c into complete_signal
is a pain and a half. While just the per_signal struct part is already
there, and the code just needs a few tweaks to allow get_signal to act
as the coredump rendezvous location.
Eric
Powered by blists - more mailing lists