[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87d02qqfxy.fsf@mpe.ellerman.id.au>
Date: Sun, 13 Sep 2020 17:44:09 +1000
From: Michael Ellerman <mpe@...erman.id.au>
To: Kees Cook <keescook@...omium.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Robert O'Callahan <rocallahan@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
"maintainer\:X86 ARCHITECTURE \(32-BIT AND 64-BIT\)" <x86@...nel.org>,
linux-arch@...r.kernel.org, Will Deacon <will@...nel.org>,
Arnd Bergmann <arnd@...db.de>,
Mark Rutland <mark.rutland@....com>,
Keno Fischer <keno@...iacomputing.com>,
Paolo Bonzini <pbonzini@...hat.com>,
kvm list <kvm@...r.kernel.org>,
Gabriel Krisman Bertazi <krisman@...labora.com>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Kyle Huey <me@...ehuey.com>
Subject: Re: [REGRESSION] x86/entry: Tracer no longer has opportunity to change the syscall number at entry via orig_ax
Kees Cook <keescook@...omium.org> writes:
> On Wed, Sep 09, 2020 at 11:53:42PM +1000, Michael Ellerman wrote:
>> I can observe the difference between v5.8 and mainline, using the
>> raw_syscall trace event and running the seccomp_bpf selftest which turns
>> a getpid (39) into a getppid (110).
>>
>> With v5.8 we see getppid on entry and exit:
>>
>> seccomp_bpf-1307 [000] .... 22974.874393: sys_enter: NR 110 (7ffff22c46e0, 40a350, 4, fffffffffffff7ab, 7fa6ee0d4010, 0)
>> seccomp_bpf-1307 [000] .N.. 22974.874401: sys_exit: NR 110 = 1304
>>
>> Whereas on mainline we see an enter for getpid and an exit for getppid:
>>
>> seccomp_bpf-1030 [000] .... 21.806766: sys_enter: NR 39 (7ffe2f6d1ad0, 40a350, 7ffe2f6d1ad0, 0, 0, 407299)
>> seccomp_bpf-1030 [000] .... 21.806767: sys_exit: NR 110 = 1027
>
> For my own notes, this is how I reproduced it:
>
> # ./perf-$VER record -e raw_syscalls:sys_enter -e raw_syscalls:sys_exit &
> # ./seccomp_bpf
> # fg
> ctrl-c
> # ./perf-$VER script | grep seccomp_bpf | awk '{print $7}' | sort | uniq -c > $VER.log
> *repeat*
> # diff -u old.log new.log
> ...
>
> (Is there an easier way to get those results?)
I did more or less the same thing, except I ran the trace event manually
(via debugfs), which is no better really.
I think the right way to test it would be to have a test that modifies
the syscall via seccomp and also monitors the trace event using perf
events. But that wouldn't be easier :)
> I will go see if I can figure out the best way to correct this.
I think this works?
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 18683598edbc..901361e2f8ea 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -60,13 +60,15 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
return ret;
}
+ syscall = syscall_get_nr(current, regs);
+
if (unlikely(ti_work & _TIF_SYSCALL_TRACEPOINT))
trace_sys_enter(regs, syscall);
syscall_enter_audit(regs, syscall);
/* The above might have changed the syscall number */
- return ret ? : syscall_get_nr(current, regs);
+ return ret ? : syscall;
}
static __always_inline long
cheers
Powered by blists - more mailing lists