Message-ID: <20221206020540.3j5x7fs7uuarzct5@macbook-pro-6.dhcp.thefacebook.com>
Date: Mon, 5 Dec 2022 18:05:40 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Theodore Ts'o <tytso@....edu>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Chris Mason <clm@...a.com>,
Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Kees Cook <keescook@...omium.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
KP Singh <kpsingh@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Florent Revest <revest@...omium.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Christoph Hellwig <hch@...radead.org>,
Benjamin Tissoires <benjamin.tissoires@...hat.com>
Subject: Re: [PATCH] error-injection: Add prompt for function error injection
On Mon, Dec 05, 2022 at 07:50:15AM +0900, Masami Hiramatsu wrote:
> On Fri, 2 Dec 2022 13:27:11 -0800
> Alexei Starovoitov <alexei.starovoitov@...il.com> wrote:
>
> > On Fri, Dec 02, 2022 at 10:56:52AM -0500, Theodore Ts'o wrote:
> > > On Thu, Dec 01, 2022 at 05:41:29PM -0800, Alexei Starovoitov wrote:
> > > >
> > > > The fault injection framework disables individual syscall with zero performance
> > > > overhead comparing to LSM and seccomp mechanisms.
> > > > BPF is not involved here. It's a kprobe in one spot.
> > > > All other syscalls don't notice it.
> > > > It's an attractive way to improve security.
> > > >
> > > > A BPF prog over syscall can filter by user, cgroup, task and give fine grain
> > > > control over security surface.
> > > > tbh I'm not aware of folks doing "syscall disabling" through command line like
> > > > above (I've only seen it through bpf), but it doesn't mean that somebody will
> > > > not start complaining that their script broke, because distro disabled fault
> > > > injection.
> > > >
> > > > So should we split FUNCTION_ERROR_INJECTION kconfig into two ?
> > > > And do default N for things like should_failslab() and
> > > > default Y for syscalls?
> > >
> > > How about calling the latter something like bpf syscall hooks, and not
> > > using the terminology "error injection" in relation to system calls?
> > > I think that might be less confusing.
> >
> > I think 'syscall error injection' name fits well.
> > It's a generic feature that both kprobes and bpf should be able to use.
> > Here is the patch...
> >
> > Even with this patch we have 7 failures in BPF selftests.
> > We will fix them later with the same mechanism as we will pick for hid-bpf.
> >
> > This patch will keep 'syscall disabling' scripts working
> > and bpf syscall adjustment will work too.
> > So no chance of breaking anyone.
> > While actual error injection inside the kernel will be disabled.
> >
> > Better name suggestions are welcome, of course.
> >
> > From 2960958f91d1134b1a8f27787875f6b9300f205e Mon Sep 17 00:00:00 2001
> > From: Alexei Starovoitov <ast@...nel.org>
> > Date: Fri, 2 Dec 2022 13:06:08 -0800
> > Subject: [PATCH] error-injection: Split FUNCTION_ERROR_INJECTION into syscalls
> > and the rest.
> >
> > Split FUNCTION_ERROR_INJECTION into:
> > - SYSCALL_ERROR_INJECTION with default y
> > - FUNC_ERROR_INJECTION with default n.
>
> OK, syscall is a bit different, it is clearly the boundary of the
> functionality, so this seems safe.
> IMHO, it is better to extend seccomp framework for testing.

seccomp doesn't support eBPF.

> >
> > The former is only used to modify return values of syscalls for security and
> > user space testing reasons while the latter is for the rest of error injection
> > in the kernel that should only be used to stress test and debug the kernel.
> >
> > Signed-off-by: Alexei Starovoitov <ast@...nel.org>
> > ---
> > arch/arm64/include/asm/syscall_wrapper.h | 8 ++++----
> > arch/powerpc/include/asm/syscall_wrapper.h | 4 ++--
> > arch/s390/include/asm/syscall_wrapper.h | 12 ++++++------
> > arch/x86/include/asm/syscall_wrapper.h | 4 ++--
> > include/asm-generic/error-injection.h | 1 +
> > include/linux/compat.h | 4 ++--
> > include/linux/syscalls.h | 4 ++--
> > kernel/fail_function.c | 1 +
> > lib/Kconfig.debug | 15 +++++++++++++++
> > lib/error-inject.c | 6 ++++++
> > 10 files changed, 41 insertions(+), 18 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/syscall_wrapper.h b/arch/arm64/include/asm/syscall_wrapper.h
> > index d30217c21eff..2c5ca239e88c 100644
> > --- a/arch/arm64/include/asm/syscall_wrapper.h
> > +++ b/arch/arm64/include/asm/syscall_wrapper.h
> > @@ -19,7 +19,7 @@
> >
> > #define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
> > asmlinkage long __arm64_compat_sys##name(const struct pt_regs *regs); \
> > - ALLOW_ERROR_INJECTION(__arm64_compat_sys##name, ERRNO); \
> > + ALLOW_ERROR_INJECTION(__arm64_compat_sys##name, SYSCALL); \
>
> But in that case, please don't use ALLOW_ERROR_INJECTION. I don't want to
> mix up the function-level error injection(FEI) and syscall error injection.

Are you suggesting that we copy-paste the ALLOW_ERROR_INJECTION logic into another
special section, with another vmlinux.lds.h hack and a copy of lib/error-inject.c?
Only to have a different name? Sorry, but that makes no sense.
Syscalls return errno to user space, while the rest of the 'error inject'
functions return errno to the kernel.
The two are quite similar. There is no need to duplicate:
  debugfs_create_dir("error_injection", ...
  fault_create_debugfs_attr("fail_function", ...
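
To illustrate why one list with a type tag is enough, here is a minimal
userspace sketch. It is only a model, not kernel code: ei_kind, ei_entry,
ei_config, ei_lookup() and ei_allowed() are hypothetical names, the two
booleans merely stand in for the proposed SYSCALL_ERROR_INJECTION and
FUNC_ERROR_INJECTION options, and the listed function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these identifiers are kernel symbols. */
enum ei_kind { EI_NONE = 0, EI_ERRNO, EI_SYSCALL };

struct ei_entry {
	const char *func;
	enum ei_kind kind;
};

/* One shared list, in the spirit of ALLOW_ERROR_INJECTION(): only the tag differs. */
static const struct ei_entry ei_list[] = {
	{ "should_failslab", EI_ERRNO },   /* kernel-internal injection point */
	{ "__x64_sys_mkdir", EI_SYSCALL }, /* syscall entry point, illustrative */
};

/* Stand-ins for the two proposed Kconfig options (assumption of this sketch). */
struct ei_config {
	bool syscall_injection; /* models SYSCALL_ERROR_INJECTION */
	bool func_injection;    /* models FUNC_ERROR_INJECTION */
};

/* Single lookup path shared by both kinds of entries. */
enum ei_kind ei_lookup(const char *func)
{
	for (size_t i = 0; i < sizeof(ei_list) / sizeof(ei_list[0]); i++)
		if (strcmp(ei_list[i].func, func) == 0)
			return ei_list[i].kind;
	return EI_NONE;
}

/* An override is allowed only for listed functions whose kind is enabled. */
bool ei_allowed(const struct ei_config *cfg, const char *func)
{
	assert(cfg != NULL && func != NULL);
	switch (ei_lookup(func)) {
	case EI_SYSCALL:
		return cfg->syscall_injection;
	case EI_ERRNO:
		return cfg->func_injection;
	default:
		return false; /* unmarked functions are never overridable */
	}
}
```

With syscall_injection set and func_injection cleared, the model accepts
"__x64_sys_mkdir" and rejects both "should_failslab" and any unlisted name,
which is the behaviour the split is meant to give a distro kernel.
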
> For this reason, I want to decouple it from the FEI. FEI will be used
> for more kernel internal functions under development (or debugging),
> which can break something because it will forcibly change the code
> behavior and the kernel will be in unexpected state.

There is no 'unexpected state'.
When Josef added BPF_ALLOW_ERROR_INJECTION() in include/linux/bpf.h,
we marked several functions in fs/btrfs/ this way.
Later more functions were marked.
The callers of all those functions have to be ready to deal with errors.
If any of the currently marked functions can oops the kernel, it's a bug
in the caller and it has to be fixed, because normal execution can
sooner or later return a similar error.

Consider ALLOW_ERROR_INJECTION(should_failslab, ERRNO);
That function was added specifically to exercise memory allocation errors.
The bpf error injection mechanism is not the only one that can generate
such errors.
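
The contract can be sketched in plain userspace C. This is only a model under
stated assumptions: set_injected_error(), should_fail_alloc(), model_alloc()
and create_object() are hypothetical names, and should_fail_alloc() merely
imitates the role of a hook like should_failslab().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical model: 0 means no injection, otherwise a negative errno. */
static int injected_error;

void set_injected_error(int err)
{
	injected_error = err;
}

/* Imitates an injectable hook: returns 0 normally, or the injected error. */
int should_fail_alloc(void)
{
	return injected_error;
}

/* The allocator consults the hook and fails exactly as a real OOM would. */
void *model_alloc(size_t size)
{
	if (should_fail_alloc())
		return NULL;
	return malloc(size);
}

/* A well-behaved caller: it propagates the failure instead of crashing. */
int create_object(void **out)
{
	void *p;

	assert(out != NULL);
	p = model_alloc(64);
	if (!p)
		return -ENOMEM; /* same path as a genuine allocation failure */
	*out = p;
	return 0;
}
```

The point of the sketch is that the injected failure goes through the same
error path as a genuine out-of-memory condition, so a caller that misbehaves
under injection would also misbehave without it.
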
Later you renamed BPF_ALLOW_ERROR_INJECTION to ALLOW_ERROR_INJECTION in
commit 540adea3809f ("error-injection: Separate error-injection from kprobe"),
but the main purpose of "bpf error injection" stayed the same.
We didn't mark random kernel functions as 'inject errors here'.
Only those whose callers must do sane things in case of errors.
So the plan that it 'will be used for more kernel internal functions under development'
doesn't fit the spirit of bpf error injection as it is today.
For this kind of random kernel injection please use some other mechanism.
We cannot allow bpf to change return values of random functions.

As for users of [BPF_]ALLOW_ERROR_INJECTION:
I couldn't find any blog, article or post that talks about the
text interface for tweaking return values, /sys/kernel/debug/fail_function.
Only links to the kernel doc.
But there are plenty of BPF users of error injection, for example:
https://github.com/iovisor/bcc/blob/master/tools/inject_example.txt
https://chaos-mesh.org/docs/simulate-kernel-chaos-on-kubernetes/
https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md#20-override-override-return-value
So we should tailor this 'error injection' facility to actual users
and not to hypothetical 'more kernel internal functions under development'.