Message-ID: <CAADnVQ+X-a92LEgcd-HjTJUcw2zR_jtUmD9U-Z6OtNnvpVwfiw@mail.gmail.com>
Date: Mon, 29 Dec 2025 16:50:17 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Ihor Solodrai <ihor.solodrai@...ux.dev>
Cc: Nathan Chancellor <nathan@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
Luis Chamberlain <mcgrof@...nel.org>, Petr Pavlu <petr.pavlu@...e.com>,
Daniel Gomez <da.gomez@...nel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>,
LKML <linux-kernel@...r.kernel.org>, linux-modules@...r.kernel.org,
bpf <bpf@...r.kernel.org>,
Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>, clang-built-linux <llvm@...ts.linux.dev>
Subject: Re: [RFC PATCH v1] module: Fix kernel panic when a symbol st_shndx is
out of bounds
On Mon, Dec 29, 2025 at 4:39 PM Ihor Solodrai <ihor.solodrai@...ux.dev> wrote:
>
> On 12/29/25 1:29 PM, Nathan Chancellor wrote:
> > Hi Ihor,
> >
> > On Mon, Dec 29, 2025 at 12:40:10PM -0800, Ihor Solodrai wrote:
> >> I think the simplest workaround is this one: use objcopy from binutils
> >> instead of llvm-objcopy when doing --update-section.
> >>
> >> There are just 3 places where that happens, so the OBJCOPY
> >> substitution is going to be localized.
> >>
> >> Also binutils is a documented requirement for compiling the kernel,
> >> whether with clang or not [1].
> >>
> >> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/changes.rst?h=v6.18#n29
> >
> > This would necessitate always specifying a CROSS_COMPILE variable when
> > cross compiling with LLVM=1, which I would really like to avoid. The
> > LLVM variants have generally been drop in substitutes for several
> > versions now so some groups such as Android may not even have GNU
> > binutils installed in their build environment (see a recent build
> > fix [1]).
> >
> > I would much prefer detecting llvm-objcopy in Kconfig (such as by
> > creating CONFIG_OBJCOPY_IS_LLVM using the existing check for
> > llvm-objcopy in X86_X32_ABI in arch/x86/Kconfig) and requiring a working
> > copy (>= 22.0.0 presuming the fix is soon merged) or an explicit opt
> > into GNU objcopy via OBJCOPY=...objcopy for CONFIG_DEBUG_INFO_BTF to be
> > selectable.
>
> I like the idea of opt into GNU objcopy, however I think we should
> avoid requiring kbuilds that want CONFIG_DEBUG_INFO_BTF to change any
> configuration (such as adding an explicit OBJCOPY= in a build command).
>
> I drafted a patch (pasted below), introducing BTF_OBJCOPY which
> defaults to GNU objcopy. This implements the workaround, and should be
> easy to update with a LLVM version check later after the bug is fixed.
>
> This bit:
>
> @@ -391,6 +391,7 @@ config DEBUG_INFO_BTF
>  	depends on PAHOLE_VERSION >= 122
>  	# pahole uses elfutils, which does not have support for Hexagon relocations
>  	depends on !HEXAGON
> +	depends on $(success,command -v $(BTF_OBJCOPY))
>
> Will turn off DEBUG_INFO_BTF if relevant GNU objcopy happens to not be
> installed.
>
> However I am not sure this is the right way to fail here. Because if
> the kernel really does need BTF (which is effectively all kernels
> using BPF), then we are breaking them anyways just downstream of the
> build.
>
> An "objcopy: command not found" might make some pipelines red, but it
> is very clear how to address.
>
> Thoughts?
>
>
> From 7c3b9cce97cc76d0365d8948b1ca36c61faddde3 Mon Sep 17 00:00:00 2001
> From: Ihor Solodrai <ihor.solodrai@...ux.dev>
> Date: Mon, 29 Dec 2025 15:49:51 -0800
> Subject: [PATCH] BTF_OBJCOPY
>
> ---
> Makefile | 6 +++++-
> lib/Kconfig.debug | 1 +
> scripts/gen-btf.sh | 10 +++++-----
> scripts/link-vmlinux.sh | 2 +-
> tools/testing/selftests/bpf/Makefile | 4 ++--
> 5 files changed, 14 insertions(+), 9 deletions(-)
All the Makefile hackery looks like overkill and the wrong direction.
What's wrong with the kernel/module/main.c change?
Module loading already does a bunch of sanity checks for ELF
in elf_validity_cache_copy().
+ if (sym[i].st_shndx >= info->hdr->e_shnum)
is just one more.
Maybe it can be moved to elf_validity*() somewhere,
but that's a minor detail.
IIUC the llvm-objcopy bug affects only the bpf testmod, so it is not a
general issue that needs top-level Makefile changes.
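
For illustration, the kind of bounds check discussed above can be sketched
in user-space C. This is not the kernel code: struct sk_sym, the SK_SHN_*
macros and sk_first_bad_shndx() are hypothetical names invented for this
sketch, and treating SHN_UNDEF, SHN_ABS and SHN_COMMON as the only special
indices is an assumption (the kernel also has its own special cases, such
as livepatch symbols). Only the comparison of st_shndx against e_shnum
comes from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reserved section indices, values as defined by the ELF specification. */
#define SK_SHN_UNDEF  0x0000u
#define SK_SHN_ABS    0xfff1u
#define SK_SHN_COMMON 0xfff2u

/* Hypothetical stand-in for Elf64_Sym; only st_shndx matters here. */
struct sk_sym {
	uint16_t st_shndx;
};

/*
 * Return the index of the first symbol whose st_shndx would index past
 * the section header table (e_shnum entries), or -1 if none does.
 */
static long sk_first_bad_shndx(const struct sk_sym *syms, size_t nsyms,
			       uint16_t e_shnum)
{
	size_t i;

	assert(syms != NULL || nsyms == 0);

	for (i = 0; i < nsyms; i++) {
		switch (syms[i].st_shndx) {
		case SK_SHN_UNDEF:
		case SK_SHN_ABS:
		case SK_SHN_COMMON:
			/* Special indices never index the section table. */
			continue;
		default:
			/* Same comparison as the proposed module check. */
			if (syms[i].st_shndx >= e_shnum)
				return (long)i;
		}
	}
	return -1;
}
```

A loader using such a helper would reject the image when the result is not
-1, instead of dereferencing sechdrs[st_shndx] out of bounds later on.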