Message-ID: <202009111156.660A7C2978@keescook>
Date: Fri, 11 Sep 2020 11:58:30 -0700
From: Kees Cook <keescook@...omium.org>
To: Michael Ellerman <mpe@...erman.id.au>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Robert O'Callahan <rocallahan@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
linux-arch@...r.kernel.org, Will Deacon <will@...nel.org>,
Arnd Bergmann <arnd@...db.de>,
Mark Rutland <mark.rutland@....com>,
Keno Fischer <keno@...iacomputing.com>,
Paolo Bonzini <pbonzini@...hat.com>,
kvm list <kvm@...r.kernel.org>,
Gabriel Krisman Bertazi <krisman@...labora.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Kyle Huey <me@...ehuey.com>
Subject: Re: [REGRESSION] x86/entry: Tracer no longer has opportunity to
 change the syscall number at entry via orig_ax

On Wed, Sep 09, 2020 at 11:53:42PM +1000, Michael Ellerman wrote:
> Hi Thomas,
>
> Sorry if this was discussed already somewhere, but I didn't see anything ...
>
> Thomas Gleixner <tglx@...utronix.de> writes:
> > On Wed, Aug 19 2020 at 10:14, Kyle Huey wrote:
> >> tl;dr: after 27d6b4d14f5c3ab21c4aef87dd04055a2d7adf14 ptracer
> >> modifications to orig_ax in a syscall entry trace stop are not honored
> >> and this breaks our code.
> ...
> > diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> > index 9852e0d62d95..fcae019158ca 100644
> > --- a/kernel/entry/common.c
> > +++ b/kernel/entry/common.c
> > @@ -65,7 +65,8 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
>
> Adding context:
>
> 	/* Do seccomp after ptrace, to catch any tracer changes. */
> 	if (ti_work & _TIF_SECCOMP) {
> 		ret = __secure_computing(NULL);
> 		if (ret == -1L)
> 			return ret;
> 	}
>
> 	if (unlikely(ti_work & _TIF_SYSCALL_TRACEPOINT))
> 		trace_sys_enter(regs, syscall);
>
> >  	syscall_enter_audit(regs, syscall);
> >
> > -	return ret ? : syscall;
> > +	/* The above might have changed the syscall number */
> > +	return ret ? : syscall_get_nr(current, regs);
> >  }
> >
> > noinstr long syscall_enter_from_user_mode(struct pt_regs *regs, long syscall)
>
> I noticed if the syscall number is changed by seccomp/ptrace, the
> original syscall number is still passed to trace_sys_enter() and audit.
>
> The old code used regs->orig_ax, so any change to the syscall number
> would be seen by the tracepoint and audit.

Ah! That's no good.

> I can observe the difference between v5.8 and mainline, using the
> raw_syscall trace event and running the seccomp_bpf selftest which turns
> a getpid (39) into a getppid (110).
>
> With v5.8 we see getppid on entry and exit:
>
> seccomp_bpf-1307 [000] .... 22974.874393: sys_enter: NR 110 (7ffff22c46e0, 40a350, 4, fffffffffffff7ab, 7fa6ee0d4010, 0)
> seccomp_bpf-1307 [000] .N.. 22974.874401: sys_exit: NR 110 = 1304
>
> Whereas on mainline we see an enter for getpid and an exit for getppid:
>
> seccomp_bpf-1030 [000] .... 21.806766: sys_enter: NR 39 (7ffe2f6d1ad0, 40a350, 7ffe2f6d1ad0, 0, 0, 407299)
> seccomp_bpf-1030 [000] .... 21.806767: sys_exit: NR 110 = 1027
>
>
> I don't know audit that well, but I think it saves the syscall number on
> entry, e.g. in __audit_syscall_entry(). So I think it will record the
> wrong syscall in this case.
>
> Seems like we should reload the syscall number before calling
> trace_sys_enter() & audit ?

Agreed. I wonder what the best way to build a regression test for this
is... hmmm.
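
The ordering issue described above can be modeled in userspace. What
follows is only a sketch under stated assumptions, not kernel code and not
the regression test itself: struct model_regs and every model_* name are
hypothetical stand-ins, and the "tracer" is a plain function that rewrites
getpid (39) to getppid (110) the way the seccomp_bpf selftest does. It only
shows why the number has to be re-read before the tracepoint and audit see
it.

```c
/*
 * Userspace model only, NOT kernel code. struct model_regs and every
 * model_* name are hypothetical stand-ins for illustration.
 */
#include <assert.h>

#define MODEL_NR_GETPID   39	/* x86-64 getpid, as in the traces above */
#define MODEL_NR_GETPPID 110	/* x86-64 getppid */

struct model_regs {
	long orig_ax;		/* syscall number as the tracer may rewrite it */
};

long model_traced_nr = -1;	/* number the sys_enter tracepoint would log */
long model_audited_nr = -1;	/* number audit would record on entry */

long model_get_nr(const struct model_regs *regs)
{
	return regs->orig_ax;
}

/* Stand-in for the seccomp/ptrace step that turns getpid into getppid. */
void model_tracer_rewrite(struct model_regs *regs)
{
	if (regs->orig_ax == MODEL_NR_GETPID)
		regs->orig_ax = MODEL_NR_GETPPID;
}

/*
 * Model of the entry path. With reload == 0 the stale number passed in
 * by the caller reaches the tracepoint and audit; with reload != 0 the
 * number is re-read after the tracer has run.
 */
long model_trace_enter(struct model_regs *regs, long syscall, int reload)
{
	model_tracer_rewrite(regs);

	if (reload)
		syscall = model_get_nr(regs);

	model_traced_nr = syscall;
	model_audited_nr = syscall;

	/* The return value is re-read in both variants, as in the patch. */
	return model_get_nr(regs);
}

/* Checks both orderings against the trace output quoted above. */
void model_check(void)
{
	struct model_regs stale = { .orig_ax = MODEL_NR_GETPID };
	struct model_regs fresh = { .orig_ax = MODEL_NR_GETPID };

	/* Without the reload: enter logs 39, but 110 is what runs. */
	assert(model_trace_enter(&stale, MODEL_NR_GETPID, 0) == MODEL_NR_GETPPID);
	assert(model_traced_nr == MODEL_NR_GETPID);
	assert(model_audited_nr == MODEL_NR_GETPID);

	/* With the reload: enter, audit and the return value all agree. */
	assert(model_trace_enter(&fresh, MODEL_NR_GETPID, 1) == MODEL_NR_GETPPID);
	assert(model_traced_nr == MODEL_NR_GETPPID);
	assert(model_audited_nr == MODEL_NR_GETPPID);
}
```

Calling model_check() from a small main() exercises both variants. A real
regression test would presumably have to compare the sys_enter and sys_exit
records of an actual traced process instead.
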
--
Kees Cook