Message-ID: <Z7je7Kryipdq6AV4@bombadil.infradead.org>
Date: Fri, 21 Feb 2025 12:15:40 -0800
From: Luis Chamberlain <mcgrof@...nel.org>
To: Lucas De Marchi <lucas.demarchi@...el.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Gomez <da.gomez@...sung.com>,
Petr Pavlu <petr.pavlu@...e.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Bill Wendling <morbo@...gle.com>,
Justin Stitt <justinstitt@...gle.com>,
linux-modules@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, clang-built-linux <llvm@...ts.linux.dev>,
iovisor-dev <iovisor-dev@...ts.iovisor.org>, gost.dev@...sung.com
Subject: Re: [PATCH 2/2] moderr: add module error injection tool

On Wed, Feb 19, 2025 at 02:17:48PM -0600, Lucas De Marchi wrote:
> On Tue, Jan 28, 2025 at 12:57:05PM -0800, Luis Chamberlain wrote:
> > On Wed, Jan 22, 2025 at 09:02:19AM -0800, Alexei Starovoitov wrote:
> > > On Wed, Jan 22, 2025 at 5:12 AM Daniel Gomez <da.gomez@...sung.com> wrote:
> > > >
> > > > Add support for a module error injection tool. The tool
> > > > can inject errors in the annotated module kernel functions
> > > > such as complete_formation(), do_init_module() and
> > > > module_enable_rodata_ro_after_init(). Module name and module function are
> > > > required parameters to have control over the error injection.
> > > >
> > > > Example: Inject error -22 to module_enable_rodata_ro_after_init for
> > > > brd module:
> > > >
> > > > sudo moderr --modname=brd --modfunc=module_enable_rodata_ro_after_init \
> > > > --error=-22 --trace
> > > > Monitoring module error injection... Hit Ctrl-C to end.
> > > > MODULE ERROR FUNCTION
> > > > brd -22 module_enable_rodata_after_init()
> > > >
> > > > Kernel messages:
> > > > [ 89.463690] brd: module loaded
> > > > [ 89.463855] brd: module_enable_rodata_ro_after_init() returned -22,
> > > > ro_after_init data might still be writable
> > > >
> > > > Signed-off-by: Daniel Gomez <da.gomez@...sung.com>
> > > > ---
> > > > tools/bpf/Makefile | 13 ++-
> > > > tools/bpf/moderr/.gitignore | 2 +
> > > > tools/bpf/moderr/Makefile | 95 +++++++++++++++++
> > > > tools/bpf/moderr/moderr.bpf.c | 127 +++++++++++++++++++++++
> > > > tools/bpf/moderr/moderr.c | 236 ++++++++++++++++++++++++++++++++++++++++++
> > > > tools/bpf/moderr/moderr.h | 40 +++++++
> > > > 6 files changed, 510 insertions(+), 3 deletions(-)
> > >
> > > The tool looks useful, but we don't add tools to the kernel repo.
> > > It has to stay out of tree.
> >
> > For selftests we do add random tools.
> >
> > > The value of error injection is not clear to me.
> >
> > It is of great value, since it deals with corner cases which are
> > otherwise hard to reproduce, in places where a real error can be
> > catastrophic.
> >
> > > Other places in the kernel use it to test paths in the kernel
> > > that are difficult to do otherwise.
> >
> > Right.
> >
> > > These 3 functions don't seem to be in this category.
> >
> > That's the key here we should focus on. The problem is when a maintainer
> > *does* agree that adding an error injection entry is useful for testing,
> > and we have a developer willing to do the work to help test / validate
> > it. In this case, this error case is rare but we do want to strive to
> > test this as we ramp up and extend our modules selftests.
> >
> > Then there is the aspect of how to mitigate how intrusive the code
> > changes to allow error injection are. In 2021 we evaluated the prospect
> > of in-kernel error injection for other areas like the block layer for
> > add_disk() failures [0] but the minimal interface to enable this from
> > userspace with debugfs was considered just too intrusive.
> >
> > This effort tried to evaluate what this could look like with eBPF to
> > reduce the required in-kernel code, and I believe the result is
> > lightweight enough for my taste, since it only requires a sprinkle of
> > ALLOW_ERROR_INJECTION().
> >
> > So, perhaps the tools aspect can just go in:
> >
> > tools/testing/selftests/module/
>
> but why would it be module-specific?

Gotta start somewhere.

> Based on its current implementation
> and discussion about inject.py it seems to be generic enough to be
> useful to test any function annotated with ALLOW_ERROR_INJECTION().
>
> As xe driver maintainer, it may be interesting to use such a tool:
>
> $ git grep ALLOW_ERROR_INJECT -- drivers/gpu/drm/xe | wc -l
> 23
>
> How does this approach compare to writing the function name on debugfs
> (the current approach in xe's testsuite)?
>
> fail_function @ https://docs.kernel.org/fault-injection/fault-injection.html#fault-injection-capabilities-infrastructure
> https://gitlab.freedesktop.org/drm/igt-gpu-tools/-/blob/master/tests/intel/xe_fault_injection.c?ref_type=heads#L108
>
> If you decide to have the tool live somewhere else, then the kmod repo
> could be a candidate.

Would we install this as part of the install target?
Danny can decide on this :)

> Although I think having it in the kernel tree is
> simpler maintenance-wise.

I think we have at least two users upstream who can make use of it. If
we end up going through tools/testing/selftests/module/ first, can't
you make use of it later?

  Luis