Message-Id: <20240620-fault-injection-statickeys-v2-0-e23947d3d84b@suse.cz>
Date: Thu, 20 Jun 2024 00:48:54 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Akinobu Mita <akinobu.mita@...il.com>, Christoph Lameter <cl@...ux.com>,
David Rientjes <rientjes@...gle.com>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Masami Hiramatsu <mhiramat@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>, Mark Rutland <mark.rutland@....com>
Cc: Jiri Olsa <jolsa@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Hyeonggon Yoo <42.hyeyoo@...il.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, bpf@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
Vlastimil Babka <vbabka@...e.cz>
Subject: [PATCH v2 0/7] static key support for error injection functions
This should now be complete, but perf_events attached bpf programs are
untested (Patch 3).
This is spread across several subsystems, but the easiest way would be
to go through a single tree, such as the mm tree.
As previously mentioned by myself [1] and others [2], the functions
designed for error injection can bring visible overhead in fastpaths
such as slab or page allocation, because even if nothing hooks into them
at a given moment, they are noninline function calls regardless of
CONFIG_ options since commits 4f6923fbb352 ("mm: make should_failslab
always available for fault injection") and af3b854492f3
("mm/page_alloc.c: allow error injection").
Live patching their callsites has also been suggested in both threads
[1] and [2], and this is an attempt to do that with static keys that
guard the call sites. When disabled, the error injection functions still
exist and are noinline, but are not being called. Any of the existing
mechanisms that can inject errors should make sure to enable the
respective static key. I have added that support to hopefully all of
them that can be used today.
- the legacy fault injection, i.e. CONFIG_FAILSLAB and
CONFIG_FAIL_PAGE_ALLOC, is handled in Patch 1, and can be passed the
address of the static key if it exists. The key will be activated if the
fault injection probability becomes non-zero, and deactivated in the
opposite transition. This also removes the overhead of the evaluation
(on top of the noninline function call) when these mechanisms are
configured in the kernel but unused at the moment.
- the generic error injection using kretprobes with
override_function_with_return is handled in Patch 2. The
ALLOW_ERROR_INJECTION() annotation is extended so that a static key
address can be passed, and the framework controls it when error
injection is enabled or disabled in debugfs for the function.
- bpf programs with prog->kprobe_override=1 can override return values
of probed functions when CONFIG_BPF_KPROBE_OVERRIDE=y. They
can be attached to perf_event, which is handled in Patch 3, or via
multi_link_attach, which is handled in Patch 4. I have tested the
latter using a modified bcc program from commit 4f6923fbb352
description, but not Patch 3 using a perf_event - testing is welcome.
- ftrace seems to be using override_function_with_return from
#define ftrace_override_function_with_return but there appear to be
no users, which was confirmed by Mark Rutland in the RFC thread.
If anyone was crazy enough to use several of the mechanisms above
simultaneously, the usage of static_key_slow_inc/dec will do the right
thing and the key will be enabled iff at least one mechanism is active.
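To illustrate the counting behaviour described above, here is a minimal
userspace sketch. It is not kernel code and not what the patches
contain: a real static key patches the branch at the call site, while
this model only keeps a reference count. All names (model_key,
model_fault_attr, model_set_probability, model_should_fail, model_alloc)
are hypothetical and exist only for this example. It models two things:
the legacy-style control toggling the key only when the probability
crosses between zero and non-zero, and the hook being skipped entirely
while no mechanism holds a reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a static key: only a reference count. */
struct model_key {
    int enabled_count;
};

static void model_key_inc(struct model_key *key)
{
    key->enabled_count++;
}

static void model_key_dec(struct model_key *key)
{
    assert(key->enabled_count > 0);  /* an unbalanced dec is a bug */
    key->enabled_count--;
}

/* The key counts as enabled iff at least one user holds a reference. */
static bool model_key_enabled(const struct model_key *key)
{
    return key->enabled_count > 0;
}

/* Hypothetical legacy-style attributes with an optional key pointer. */
struct model_fault_attr {
    unsigned long probability;
    struct model_key *key;  /* NULL if the site has no key */
};

/* Touch the key only on zero <-> non-zero probability transitions. */
static void model_set_probability(struct model_fault_attr *attr,
                                  unsigned long new_prob)
{
    bool was_active = attr->probability != 0;
    bool now_active = new_prob != 0;

    attr->probability = new_prob;
    if (!attr->key || was_active == now_active)
        return;
    if (now_active)
        model_key_inc(attr->key);
    else
        model_key_dec(attr->key);
}

/* Counts how often the (empty) injection hook was really called. */
static int model_hook_calls;

/* Stand-in for a noinline hook such as should_failslab(). */
static int model_should_fail(void)
{
    model_hook_calls++;
    return 0;
}

/* Guarded call site: the hook is not called while the key is off. */
static int model_alloc(const struct model_key *key)
{
    if (model_key_enabled(key) && model_should_fail())
        return -1;
    return 0;
}
```

With this model, setting a non-zero probability and separately taking a
second reference (standing in for e.g. an attached bpf program) leaves
the key enabled until both have been dropped, and model_hook_calls only
grows while the key is enabled.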
In addition to the static key support, Patch 5 makes it possible to
stop making the fault injection functions noninline with
CONFIG_FUNCTION_ERROR_INJECTION=n by compiling out the BTF_ID()
references for bpf_non_sleepable_error_inject which are unnecessary in
that case.
Patches 6 and 7 implement the static keys for the two mm fault injection
sites in slab and page allocators. I have measured the improvement for
the slab case, as described in Patch 6:
To demonstrate the reduced overhead of calling an empty
should_failslab() function, a kernel build with
CONFIG_FUNCTION_ERROR_INJECTION enabled but CONFIG_FAILSLAB disabled,
and CPU mitigations enabled, was used in a qemu-kvm (virtme-ng) guest on
an AMD Ryzen 7 2700 machine, and the execution of a program trying to
open() a non-existent file was measured 3 times:
    for (int i = 0; i < 10000000; i++) {
        open("non_existent", O_RDONLY);
    }
After this patch, the measured real time was 4.3% smaller. Using perf
profiling it was verified that should_failslab was gone from the
profile.
With CONFIG_FAILSLAB also enabled, the patched kernel performance was
unaffected, as expected, while the unpatched kernel's performance was
worse, resulting in the relative speedup being 10.5%. This means
CONFIG_FAILSLAB no longer needs to be an option suitable only for debug
kernel builds.
There might be other such fault injection callsites in hotpaths of other
subsystems but I didn't search for them at this point. With all the
preparations in place, it should be simple to improve them now.
FAQ:
Q: Does this improve only config options nobody uses in production
anyway?
A: No, the error injection hooks are unconditionally noninline functions
even if they are empty. CONFIG_FUNCTION_ERROR_INJECTION=y is probably
rather common, with overrides done via bpf. The goal was to eliminate
this unnecessary overhead. But as a secondary benefit, the legacy
fault injection options can now also be enabled in production kernels
without extra overhead.
Q: Should we remove the legacy fault injection framework?
A: Maybe? I didn't want to wait for that to happen, so it's just handled
as well (Patch 1). The generic error injection handling and bpf needed
the most effort anyway.
Q: Should there be a unified way to register the kprobes that override
return values, that would also handle the static key control?
A: Possibly, but I'm not familiar enough with the area to do that. I
found every case handled by patches 2-4 to be so different, I just
modified them all. If a unification comes later, it should not change
most of what's done by this patchset.
[1] https://lore.kernel.org/6d5bb852-8703-4abf-a52b-90816bccbd7f@suse.cz/
[2] https://lore.kernel.org/3j5d3p22ssv7xoaghzraa7crcfih3h2qqjlhmjppbp6f42pg2t@kg7qoicog5ye/
Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
---
Changes in v2:
- Add error injection static key control for bpf programs with
kprobe_override.
- Add separate get_injection_key() for querying (Masami Hiramatsu)
- Compile everything out with CONFIG_FUNCTION_ERROR_INJECTION=n
- Link to v1: https://lore.kernel.org/r/20240531-fault-injection-statickeys-v1-0-a513fd0a9614@suse.cz
---
Vlastimil Babka (7):
      fault-inject: add support for static keys around fault injection sites
      error-injection: support static keys around injectable functions
      bpf: support error injection static keys for perf_event attached progs
      bpf: support error injection static keys for multi_link attached progs
      bpf: do not create bpf_non_sleepable_error_inject list when unnecessary
      mm, slab: add static key for should_failslab()
      mm, page_alloc: add static key for should_fail_alloc_page()
include/asm-generic/error-injection.h | 13 ++++++-
include/asm-generic/vmlinux.lds.h | 2 +-
include/linux/error-injection.h | 12 +++++--
include/linux/fault-inject.h | 14 ++++++--
kernel/bpf/verifier.c | 15 ++++++++
kernel/fail_function.c | 10 ++++++
kernel/trace/bpf_trace.c | 65 +++++++++++++++++++++++++++++++----
kernel/trace/trace_kprobe.c | 30 ++++++++++++++--
kernel/trace/trace_probe.h | 5 +++
lib/error-inject.c | 19 ++++++++++
lib/fault-inject.c | 43 ++++++++++++++++++++++-
mm/fail_page_alloc.c | 3 +-
mm/failslab.c | 2 +-
mm/internal.h | 2 ++
mm/page_alloc.c | 30 ++++++++++++++--
mm/slab.h | 3 ++
mm/slub.c | 30 ++++++++++++++--
17 files changed, 274 insertions(+), 24 deletions(-)
---
base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
change-id: 20240530-fault-injection-statickeys-66b7222e91b7
Best regards,
--
Vlastimil Babka <vbabka@...e.cz>