Message-ID: <CALCETrV2SCLg1tUUbC1SeQhyn9097ktncEKvd=jh2woSZ3g8ow@mail.gmail.com>
Date: Tue, 1 Dec 2020 16:04:24 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Gabriel Krisman Bertazi <krisman@...labora.com>
Cc: Andrew Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Kees Cook <keescook@...omium.org>,
Paul Gofman <gofmanp@...il.com>,
Christian Brauner <christian.brauner@...ntu.com>,
Peter Zijlstra <peterz@...radead.org>,
Matthew Wilcox <willy@...radead.org>,
Shuah Khan <shuah@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>, X86 ML <x86@...nel.org>,
kernel@...labora.com
Subject: Re: [PATCH v8 4/7] entry: Support Syscall User Dispatch on common
syscall entry
On Fri, Nov 27, 2020 at 11:33 AM Gabriel Krisman Bertazi
<krisman@...labora.com> wrote:
>
> Syscall User Dispatch (SUD) must take precedence over seccomp and
> ptrace, since the use case is emulation (it can be invoked with a
> different ABI) such that seccomp filtering by syscall number doesn't
> make sense in the first place. In addition, either the syscall is
> dispatched back to userspace, in which case there is no resource to
> trace, or the syscall will be executed, and seccomp/ptrace will execute
> next.
>
> Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as
> well, just to prevent a trace exit event when dispatch was triggered.
> For that, the on_syscall_dispatch() examines context to skip the
> tracepoint, audit and other work.
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@...labora.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> Changes since v6:
> - Update do_syscall_intercept signature (Christian Brauner)
> - Move it to before tracepoints
> - Use SYSCALL_WORK flags
> ---
> include/linux/entry-common.h | 2 ++
> kernel/entry/common.c | 17 +++++++++++++++++
> 2 files changed, 19 insertions(+)
>
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index 49b26b216e4e..a6e98b4ba8e9 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -44,10 +44,12 @@
> SYSCALL_WORK_SYSCALL_TRACE | \
> SYSCALL_WORK_SYSCALL_EMU | \
> SYSCALL_WORK_SYSCALL_AUDIT | \
> + SYSCALL_WORK_SYSCALL_USER_DISPATCH | \
> ARCH_SYSCALL_WORK_ENTER)
> #define SYSCALL_WORK_EXIT (SYSCALL_WORK_SYSCALL_TRACEPOINT | \
> SYSCALL_WORK_SYSCALL_TRACE | \
> SYSCALL_WORK_SYSCALL_AUDIT | \
> + SYSCALL_WORK_SYSCALL_USER_DISPATCH | \
> ARCH_SYSCALL_WORK_EXIT)
>
> /*
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index f1b12dc32ff4..ec20aba3b890 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,8 @@
> #include <linux/livepatch.h>
> #include <linux/audit.h>
>
> +#include "common.h"
> +
> #define CREATE_TRACE_POINTS
> #include <trace/events/syscalls.h>
>
> @@ -47,6 +49,16 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
> {
> long ret = 0;
>
> + /*
> + * Handle Syscall User Dispatch. This must come first, since
> + * the ABI here can be something that doesn't make sense for
> + * other syscall_work features.
> + */
> + if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> + if (do_syscall_user_dispatch(regs))
> + return -1L;
> + }
> +
> /* Handle ptrace */
> if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
> ret = arch_syscall_enter_tracehook(regs);
> @@ -232,6 +244,11 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
> {
> bool step;
>
> + if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> + if (on_syscall_dispatch())
> + return;
> + }

I think this would be less confusing if you just open-coded the body
of on_syscall_dispatch() here and got rid of the helper.
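
As a rough illustration of the open-coded form, here is a small userspace
model, not kernel code. It assumes the helper does nothing more than test
and clear a per-task "on dispatch" flag, which this message does not show.
struct task_model, exit_work_model(), run_example() and the WORK_* values
are hypothetical names made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's work flags; values are arbitrary. */
#define WORK_USER_DISPATCH	(1UL << 0)
#define WORK_AUDIT		(1UL << 1)

/* Hypothetical model of the per-task dispatch state (an assumption). */
struct task_model {
	bool on_dispatch;	/* set when a syscall was dispatched to userspace */
	int exit_work_runs;	/* counts how often the remaining exit work ran */
};

/*
 * Model of syscall_exit_work() with the helper open-coded: test the flag,
 * clear it, and skip the remaining exit work for a dispatched syscall.
 */
static void exit_work_model(struct task_model *t, unsigned long work)
{
	if (work & WORK_USER_DISPATCH) {
		if (t->on_dispatch) {
			t->on_dispatch = false;
			return;
		}
	}

	/* Stand-in for the audit, tracepoint and ptrace exit handling. */
	t->exit_work_runs++;
}

/* Usage: a dispatched syscall skips exit work once; the next one does not. */
static void run_example(void)
{
	struct task_model t = { .on_dispatch = true, .exit_work_runs = 0 };

	exit_work_model(&t, WORK_USER_DISPATCH | WORK_AUDIT);
	assert(t.exit_work_runs == 0 && !t.on_dispatch);

	exit_work_model(&t, WORK_USER_DISPATCH | WORK_AUDIT);
	assert(t.exit_work_runs == 1);
}
```

With the test-and-clear written out at the call site, it is visible there
that the early return also consumes the flag.
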
--Andy