Message-ID: <YTP+u6Kb3xguT0sN@zeniv-ca.linux.org.uk>
Date:   Sat, 4 Sep 2021 23:18:19 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Oleg Nesterov <oleg@...hat.com>, Kyle Huey <me@...ehuey.com>,
        Kees Cook <keescook@...omium.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: [RFC] another signal oddity

	Suppose we are sending e.g. SIGINT to a process
(by kill(2)).  The target has two threads, set up as follows:
	* thread1 (leader) has SIGINT blocked
	* thread2 does *not* have SIGINT blocked
	* thread2 is ptraced and running (not in ptrace stop)
	* the handler for SIGINT is SIG_DFL

complete_signal() is called.  wants_signal(SIGINT, thread1)
is false.  type is not PIDTYPE_PID and thread_group_empty()
is false.  wants_signal(SIGINT, thread2) is true, so we end
up with signal->curr_target and t set to thread2, while
p is still thread1.
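
The walk above can be sketched as a toy userspace model.  This is not
the kernel code: struct toy_task, toy_wants_signal(), toy_pick_target()
and toy_demo() are hypothetical names made up for illustration, and the
"wants the signal" test is reduced to the blocked check alone, which is
the only part that matters in this scenario.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct; only what the walk needs. */
struct toy_task {
	bool blocks_sig;         /* the signal is in this thread's blocked set */
	struct toy_task *next;   /* circular list of threads in the group */
};

/* Simplified "does this thread want the signal" test: blocked or not. */
static bool toy_wants_signal(const struct toy_task *t)
{
	return !t->blocks_sig;
}

/*
 * Try p first; failing that, walk the group from *curr_target.
 * Returns the chosen thread, or NULL if nobody can take the signal now.
 */
static struct toy_task *toy_pick_target(struct toy_task *p,
					struct toy_task **curr_target,
					bool pid_only)
{
	struct toy_task *t;

	if (toy_wants_signal(p))
		return p;
	if (pid_only || p->next == p)	/* thread-directed, or single-threaded */
		return NULL;

	t = *curr_target;
	while (!toy_wants_signal(t)) {
		t = t->next;
		if (t == *curr_target)
			return NULL;
	}
	*curr_target = t;
	return t;
}

/* The scenario from this mail: thread1 blocks SIGINT, thread2 does not. */
static void toy_demo(void)
{
	struct toy_task thread1 = { .blocks_sig = true };
	struct toy_task thread2 = { .blocks_sig = false };
	struct toy_task *curr = &thread1;

	thread1.next = &thread2;
	thread2.next = &thread1;

	assert(toy_pick_target(&thread1, &curr, false) == &thread2);
	assert(curr == &thread2);	/* curr_target moved to thread2 */
}
```

Calling toy_demo() from a main() exercises the case described here:
the signal is sent to thread1 (p), but the chosen target (t) ends up
being thread2.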

And then we hit this:
        if (sig_fatal(p, sig) &&
True - the handler is SIG_DFL and unhandled SIGINT is fatal
            !(signal->flags & SIGNAL_GROUP_EXIT) &&
True - we are not in group exit.
            !sigismember(&t->real_blocked, sig) &&
True - nobody is in sigtimedwait(), so ->real_blocked is empty.
            (sig == SIGKILL || !p->ptrace)) {
Also true - thread1 is not ptraced.

So we go ahead and initiate a group exit.  Both thread1 and
thread2 get SIGKILL added to ->pending and are woken up.

But AFAICS we have no business doing that - thread1 has SIGINT
blocked, so get_signal() in it would not pick that SIGINT.
And thread2 is traced, so picking SIGINT would've hit
ptrace_signal(), stop and let the tracer deal with it.  If
the tracer decides to cancel that SIGINT, we would continue
just fine.

Which order of execution could possibly lead to fatal signal
delivery?

IDGI...  Looks like that !p->ptrace used to be !t->ptrace until
426915796cca "kernel/signal.c: remove the no longer needed
SIGNAL_UNKILLABLE check in complete_signal()" back in 2017,
but I don't see anything in the commit message that would explain
that part of the change.  The testcase in there wouldn't care
either way...
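
To make the p vs. t difference concrete, here is a toy userspace model
of just the last clause of that condition.  It is a sketch, not kernel
code: struct model_task, model_clause_p(), model_clause_t() and
model_demo() are hypothetical names, and the other three clauses are
assumed true, as they are in the scenario above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduction of task_struct to the one field that matters. */
struct model_task {
	bool ptrace;	/* stands in for a non-zero ->ptrace */
};

/* Current form of the clause: tests p, the task the signal was sent to. */
static bool model_clause_p(const struct model_task *p,
			   const struct model_task *t, bool is_sigkill)
{
	(void)t;	/* the chosen target is not consulted */
	return is_sigkill || !p->ptrace;
}

/* Form said above to predate 426915796cca: tests t, the chosen target. */
static bool model_clause_t(const struct model_task *p,
			   const struct model_task *t, bool is_sigkill)
{
	(void)p;	/* the original recipient is not consulted */
	return is_sigkill || !t->ptrace;
}

/* thread1 is p (untraced leader), thread2 is t (traced target). */
static void model_demo(void)
{
	struct model_task thread1 = { .ptrace = false };
	struct model_task thread2 = { .ptrace = true };

	/* With !p->ptrace the clause holds, so the group exit starts. */
	assert(model_clause_p(&thread1, &thread2, false));
	/* With !t->ptrace it does not, so the signal is left for get_signal(). */
	assert(!model_clause_t(&thread1, &thread2, false));
	/* SIGKILL passes the clause in either form. */
	assert(model_clause_t(&thread1, &thread2, true));
}
```

In this model the two forms disagree only when exactly one of p and t
is traced, which is the situation described in this mail.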

What am I missing here?