linux-kernel - Re: siginfo pid not populated from ptrace?

Message-ID: <20181129232245.GC4676@cisco>
Date:   Thu, 29 Nov 2018 16:22:45 -0700
From:   Tycho Andersen <tycho@...ho.ws>
To:     Kees Cook <keescook@...omium.org>
Cc:     "Eric W. Biederman" <ebiederm@...ssion.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Oleg Nesterov <oleg@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: siginfo pid not populated from ptrace?

On Thu, Nov 29, 2018 at 01:17:01PM -0800, Kees Cook wrote:
> On Tue, Nov 27, 2018 at 8:44 PM Eric W. Biederman <ebiederm@...ssion.com> wrote:
> >
> > Kees Cook <keescook@...omium.org> writes:
> >
> > > On Tue, Nov 27, 2018 at 4:38 PM, Kees Cook <keescook@...omium.org> wrote:
> > >> On Tue, Nov 27, 2018 at 3:21 PM, Tycho Andersen <tycho@...ho.ws> wrote:
> > >>> On Mon, Nov 12, 2018 at 12:24:43PM -0700, Tycho Andersen wrote:
> > >>>> On Mon, Nov 12, 2018 at 11:55:38AM -0700, Tycho Andersen wrote:
> > > >>>> > I haven't managed to reproduce it on stock v4.20-rc2, unfortunately.
> > >>>>
> > >>>> Ok, now I have,
> > >>>>
> > >>>> seccomp_bpf.c:2736:global.syscall_restart:Expected getpid() (1493) == info._sifields._kill.si_pid (0)
> > >>>> global.syscall_restart: Test failed at step #22
> > >>>
> > >>> Seems like this is still happening on v4.20-rc4,
> > >>>
> > >>> [ RUN      ] global.syscall_restart
> > >>> seccomp_bpf.c:2736:global.syscall_restart:Expected getpid() (1901) == info._sifields._kill.si_pid (0)
> > >>> global.syscall_restart: Test failed at step #22
> > >>
> > > >> This fails every time for me -- is it still racy for you?
> > > >>
> > > >> I'm attempting a bisect, hoping it doesn't _become_ racy for me. ;)
> > >
> > > > This bisects to here for me:
> > >
> > > commit f149b31557446aff9ca96d4be7e39cc266f6e7cc
> > > Author: Eric W. Biederman <ebiederm@...ssion.com>
> > > Date:   Mon Sep 3 09:50:36 2018 +0200
> > >
> > >     signal: Never allocate siginfo for SIGKILL or SIGSTOP
> > >
> > >     The SIGKILL and SIGSTOP signals are never delivered to userspace so
> > >     queued siginfo for these signals can never be observed.  Therefore
> > >     remove the chance of failure by never even attempting to allocate
> > >     siginfo in those cases.
> > >
> > >     Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
> > >     Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
> > >
> > > They are certainly visible via seccomp ;)
> >
> > Well SIGSTOP is visible via PTRACE_GETSIGINFO.
> >
> > I see what is happening now.  Since we don't have queued siginfo
> > we generate some as:
> >                 /*
> >                  * Ok, it wasn't in the queue.  This must be
> >                  * a fast-pathed signal or we must have been
> >                  * out of queue space.  So zero out the info.
> >                  */
> >                 clear_siginfo(info);
> >                 info->si_signo = sig;
> >                 info->si_errno = 0;
> >                 info->si_code = SI_USER;
> >                 info->si_pid = 0;
> >                 info->si_uid = 0;
> >
> > Which allows last_siginfo to be set,
> > so despite not really having any siginfo, PTRACE_GETSIGINFO
> > has something to return and so does not return -EINVAL.
> >
> > Reconstructing my context: that change was part of removing
> > SEND_SIG_FORCED, so this looks like it will take a little more than
> > a revert to fix.
> >
> > This is definitely a change that is visible to user space.  The logic in
> > my patch was definitely wrong with respect to SIGSTOP and
> > PTRACE_GETSIGINFO.  Is there something in userspace that actually cares?
> > AKA is the idiom that the test seccomp_bpf.c is using something that
> > non-test code does?
> 
> I think this would be needed by any ptracer that handled multiple
> threads. It needs to figure out which pid stopped. I think it's worth
> fixing, yes.
> 
> > The change below should restore the old behavior.  I am just wondering
> > if this is something we want to do.  siginfo is allocated with
> > GFP_ATOMIC so if your machine is under memory pressure there is a real
> > chance the allocation can fail.  Which would cause whatever is breaking
> > now to break less deterministically then.
> 
> I think memory pressure that would block a 128 byte GFP_ATOMIC
> allocation would mean the system was about to seriously fall over.
> The user-facing behavior change, and the fact that an existing test
> was already checking for this, mean we need to fix it.
> 
> > If we need to fix this do we need to make siginfo allocation more
> > reliable?
> 
> I don't think so -- we'd already get a WARN() if allocation failed.
> 
> > Eric
> >
> >
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index 4fd431ce4f91..5c34c55bfea4 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -1057,10 +1057,10 @@ static int __send_signal(int sig, struct kernel_siginfo *info, struct task_struc
> >
> >         result = TRACE_SIGNAL_DELIVERED;
> >         /*
> > -        * Skip useless siginfo allocation for SIGKILL SIGSTOP,
> > +        * Skip useless siginfo allocation for SIGKILL,
> >          * and kernel threads.
> >          */
> > -       if (sig_kernel_only(sig) || (t->flags & PF_KTHREAD))
> > +       if ((sig == SIGKILL) || (t->flags & PF_KTHREAD))
> >                 goto out_set;
> >
> >         /*
> >
> 
> This fixes it for me!
> 
> Reported-by: Tycho Andersen <tycho@...ho.ws>
> Tested-by: Kees Cook <keescook@...omium.org>
> Fixes: f149b3155744 ("signal: Never allocate siginfo for SIGKILL or SIGSTOP")

Thanks guys, it works for me too.

Tested-by: Tycho Andersen <tycho@...ho.ws>

Tycho
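
For anyone who wants to check the behavior outside the seccomp selftest,
here is a minimal userspace sketch of the same idea. It is not the
selftest code: the helper name sigstop_sender_pid() is made up for this
illustration, and it assumes ptrace is permitted in the environment
(e.g. not blocked by Yama or a sandbox). The tracer sends SIGSTOP to a
traced child and reads back si_pid with PTRACE_GETSIGINFO. With siginfo
queued for SIGSTOP the value should be the tracer's pid; on a kernel with
f149b3155744 and without the fix above, 0 would be expected instead.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the kernel tree or the selftest):
 * returns the si_pid reported by PTRACE_GETSIGINFO for a SIGSTOP that
 * the caller sent to a traced child, or -1 if the setup failed.
 */
pid_t sigstop_sender_pid(void)
{
	siginfo_t info;
	int status;
	pid_t result = -1;
	pid_t child = fork();

	if (child < 0)
		return -1;

	if (child == 0) {
		/* Tracee: request tracing, stop once to sync, then idle. */
		if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) != 0)
			_exit(1);
		raise(SIGSTOP);
		for (;;)
			pause();
	}

	/* First stop comes from the child's own raise(SIGSTOP). */
	if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
		goto out;

	/* Resume the child without delivering that first SIGSTOP. */
	if (ptrace(PTRACE_CONT, child, NULL, NULL) != 0)
		goto out;

	/* Now the tracer itself sends SIGSTOP, as the selftest does. */
	if (kill(child, SIGSTOP) != 0)
		goto out;
	if (waitpid(child, &status, 0) != child ||
	    !WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
		goto out;

	/* Ask the kernel who sent the signal that caused this stop. */
	if (ptrace(PTRACE_GETSIGINFO, child, NULL, &info) == 0)
		result = info.si_pid;

out:
	/* Clean up the child regardless of how far we got. */
	kill(child, SIGKILL);
	waitpid(child, &status, 0);
	return result;
}
```

Comparing the return value against getpid() in the caller gives the same
check as the failing assertion at seccomp_bpf.c:2736 quoted above.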