lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9edd1395-8651-446b-b056-9428076cd830@linux.dev>
Date: Tue, 23 Dec 2025 21:36:22 -0800
From: Yonghong Song <yonghong.song@...ux.dev>
To: Ihor Solodrai <ihor.solodrai@...ux.dev>,
 Luis Chamberlain <mcgrof@...nel.org>, Petr Pavlu <petr.pavlu@...e.com>,
 Daniel Gomez <da.gomez@...nel.org>, Nathan Chancellor <nathan@...nel.org>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org,
 bpf@...r.kernel.org, linux-kbuild@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [RFC PATCH v1] module: Fix kernel panic when a symbol st_shndx is
 out of bounds



On 12/23/25 4:57 PM, Ihor Solodrai wrote:
> I've been chasing down the following flaky splat, introduced by recent
> changes in BTF generation [1]:
>
>    ------------[ cut here ]------------
>    BUG: unable to handle page fault for address: ffa000000233d828
>    #PF: supervisor read access in kernel mode
>    #PF: error_code(0x0000) - not-present page
>    PGD 100000067 P4D 100253067 PUD 100258067 PMD 0
>    Oops: Oops: 0000 [#1] SMP NOPTI
>    CPU: 1 UID: 0 PID: 390 Comm: test_progs Tainted: G        W  OE       6.19.0-rc1-gf785a31395d9 #331 PREEMPT(full)
>    Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
>    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.el9 04/01/2014
>    RIP: 0010:simplify_symbols+0x2b2/0x480
>       9.737179] Code: 85 f6 4d 89 f7 b8 01 00 00 00 4c 0f 44 f8 49 83 fd f0 4d 0f 44 fe 75 5b 4d 85 ff 0f 85 76 ff ff ff eb 50 49 8b 4e 20 c1 e0 06 <48> 8b 44 01 10 9 cf fd ff ff 49 89 c5 eb 36 49 c7 c5
>    RSP: 0018:ffa00000017afc40 EFLAGS: 00010216
>    RAX: 00000000003fffc0 RBX: 0000000000000002 RCX: ffa0000001f3d858
>    RDX: ffffffffc0218038 RSI: ffffffffc0218008 RDI: aaaaaaaaaaaaaaab
>    RBP: ffa00000017afd18 R08: 0000000000000072 R09: 0000000000000069
>    R10: ffffffff8160d6ca R11: 0000000000000000 R12: ffa0000001f3d577
>    R13: ffffffffc0214058 R14: ffa00000017afdc0 R15: ffa0000001f3e518
>    FS:  00007f1c638654c0(0000) GS:ff1100089b7bc000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: ffa000000233d828 CR3: 000000010ba1f001 CR4: 0000000000771ef0
>    PKRU: 55555554
>    Call Trace:
>     <TASK>
>     ? __kmalloc_node_track_caller_noprof+0x37f/0x740
>     ? __pfx_setup_modinfo_srcversion+0x10/0x10
>     ? srso_alias_return_thunk+0x5/0xfbef5
>     ? kstrdup+0x4a/0x70
>     ? srso_alias_return_thunk+0x5/0xfbef5
>     ? setup_modinfo_srcversion+0x1a/0x30
>     ? srso_alias_return_thunk+0x5/0xfbef5
>     ? setup_modinfo+0x12b/0x1e0
>     load_module+0x133a/0x1610
>     __x64_sys_finit_module+0x31b/0x450
>     ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
>     do_syscall_64+0x80/0x2d0
>     ? srso_alias_return_thunk+0x5/0xfbef5
>     ? exc_page_fault+0x95/0xc0
>     entry_SYSCALL_64_after_hwframe+0x76/0x7e
>    RIP: 0033:0x7f1c63a2582d
>       9.794028] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff 8 8b 0d bb 15 0f 00 f7 d8 64 89 01 48
>    RSP: 002b:00007ffe513df128 EFLAGS: 00000206 ORIG_RAX: 0000000000000139
>    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1c63a2582d
>    RDX: 0000000000000000 RSI: 0000000000ee83f9 RDI: 0000000000000016
>    RBP: 00007ffe513df150 R08: 0000000000000000 R09: 0000000000000000
>    R10: 0000000000000000 R11: 0000000000000206 R12: 00007ffe513e3588
>    R13: 000000000088fad0 R14: 00000000014bddb0 R15: 00007f1c63ba7000
>     </TASK>
>    Modules linked in: bpf_testmod(OE)
>    CR2: ffa000000233d828
>    ---[ end trace 0000000000000000 ]---
>    RIP: 0010:simplify_symbols+0x2b2/0x480
>       9.821595] Code: 85 f6 4d 89 f7 b8 01 00 00 00 4c 0f 44 f8 49 83 fd f0 4d 0f 44 fe 75 5b 4d 85 ff 0f 85 76 ff ff ff eb 50 49 8b 4e 20 c1 e0 06 <48> 8b 44 01 10 9 cf fd ff ff 49 89 c5 eb 36 49 c7 c5
>    RSP: 0018:ffa00000017afc40 EFLAGS: 00010216
>    RAX: 00000000003fffc0 RBX: 0000000000000002 RCX: ffa0000001f3d858
>    RDX: ffffffffc0218038 RSI: ffffffffc0218008 RDI: aaaaaaaaaaaaaaab
>    RBP: ffa00000017afd18 R08: 0000000000000072 R09: 0000000000000069
>    R10: ffffffff8160d6ca R11: 0000000000000000 R12: ffa0000001f3d577
>    R13: ffffffffc0214058 R14: ffa00000017afdc0 R15: ffa0000001f3e518
>    FS:  00007f1c638654c0(0000) GS:ff1100089b7bc000(0000) knlGS:0000000000000000
>    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>    CR2: ffa000000233d828 CR3: 000000010ba1f001 CR4: 0000000000771ef0
>    PKRU: 55555554
>    Kernel panic - not syncing: Fatal exception
>    Kernel Offset: disabled
>
> This hasn't happened on BPF CI so far, for example, however I was able
> to reproduce it on a particular x64 machine using a kernel built with
> LLVM 20.
>
> The crash happens on attempt to load one of the BPF selftest modules
> (tools/testing/selftests/bpf/test_kmods/bpf_test_modorder_x.ko) which
> is used by kfunc_module_order test.
>
> The reason for the crash is that simplify_symbols() doesn't check for
> bounds of the ELF section index:
>
>         for (i = 1; i < symsec->sh_size / sizeof(Elf_Sym); i++) {
> 		const char *name = info->strtab + sym[i].st_name;
>
> 		switch (sym[i].st_shndx) {
> 		case SHN_COMMON:
>
> 		[...]
>
> 		default:
> 			/* Divert to percpu allocation if a percpu var. */
> 			if (sym[i].st_shndx == info->index.pcpu)
> 				secbase = (unsigned long)mod_percpu(mod);
> 			else
>    /** HERE --> **/		secbase = info->sechdrs[sym[i].st_shndx].sh_addr;
> 			sym[i].st_value += secbase;
> 			break;
> 		}
> 	}
>
> And in the case I was able to reproduce, the value 0xffff
> (SHN_HIRESERVE aka SHN_XINDEX [2]) fell through here.
>
> Now this code fragment is between 15 and 20 years old, so obviously
> it's not expected for a kmodule symbol to have such st_shndx
> value. Even so, the kernel probably should fail loading the module
> instead of crashing, which is what this patch attempts to fix.
>
> Investigating further, I discovered that the module binary became
> corrupted by `${OBJCOPY} --update-section` operation updating .BTF_ids
> section data in scripts/gen-btf.sh. This explains how the bug has
> surfaced after gen-btf.sh was introduced:
>
>    $ llvm-readelf -s --wide bpf_test_modorder_x.ko | grep 'BTF_ID'
>    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (2), but unable to locate the extended symbol index table
>    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (3), but unable to locate the extended symbol index table
>    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (4), but unable to locate the extended symbol index table
>         3: 0000000000000000    16 NOTYPE  LOCAL  DEFAULT   RSV[0xffff] __BTF_ID__set8__bpf_test_modorder_kfunc_x_ids
>    llvm-readelf: warning: 'bpf_test_modorder_x.ko': found an extended symbol index (16), but unable to locate the extended symbol index table
>         4: 0000000000000008     4 OBJECT  LOCAL  DEFAULT   RSV[0xffff] __BTF_ID__func__bpf_test_modorder_retx__44417
>
> vs expected
>
>    $ llvm-readelf -s --wide bpf_test_modorder_x.ko | grep 'BTF_ID'
>         3: 0000000000000000    16 NOTYPE  LOCAL  DEFAULT     6 __BTF_ID__set8__bpf_test_modorder_kfunc_x_ids
>         4: 0000000000000008     4 OBJECT  LOCAL  DEFAULT     6 __BTF_ID__func__bpf_test_modorder_retx__44417
>
> But why? Updating section data without changing it's size is not
> supposed to affect sections indices, right?
>
> With a bit more testing I confirmed that this is a LLVM-specific
> issue (doesn't reproduce with GCC kbuild), and it's not stable,
> because in link-vmlinux.h we also do:
>
>      ${OBJCOPY} --update-section .BTF_ids=${btfids_vmlinux} ${VMLINUX}
>
> However:
>
>    $ llvm-readelf -s --wide ~/workspace/prog-aux/linux/vmlinux | grep 0xffff
>    # no output, which is good
>
> So the suspect is the implementation of llvm-objcopy. As it turns out
> there is a relevant known bug that explains the flakiness and isn't
> fixed yet [3].
>
> [1] https://lore.kernel.org/bpf/20251219181825.1289460-3-ihor.solodrai@linux.dev/
> [2] https://man7.org/linux/man-pages/man5/elf.5.html
> [3] https://github.com/llvm/llvm-project/issues/168060#issuecomment-3533552952
>
> Signed-off-by: Ihor Solodrai <ihor.solodrai@...ux.dev>
>
> ---
>
> RFC
>
> While this llvm-objcopy bug is not fixed, we can not trust it in the
> kernel build pipeline. In the short-term we have to come up with a
> workaround for .BTF_ids section update and replace the calls to
> ${OBJCOPY} --update-section with something else.
>
> One potential workaround is to force the use of the objcopy (from
> binutils) instead of llvm-objcopy when updating .BTF_ids section.
>
> Alternatively, we could just dd the .BTF_ids data computed by
> resolve_btfids at the right offset in the target ELF file.
>
> Surprisingly I couldn't find a good way to read a section offset and
> size from the ELF with a specified format in a command line. Both
> readelf and {llvm-}objdump give a human readable output, and it
> appears we can't rely on the column order, for example.
>
> We could still try parsing readelf output with awk/grep, covering
> output variants that appear in the kernel build.
>
> We can also do:
>
>     llvm-readobj --elf-output-style=JSON --sections "$elf" | \
>          jq -r --arg name .BTF_ids '
>              .[0].Sections[] |
>              select(.Section.Name.Name == $name) |
>              "\(.Section.Offset) \(.Section.Size)"'
>
> ...but idk man, doesn't feel right.
>
> Most reliable way to determine the size and offset of .BTF_ids section
> is probably reading them by a C program with libelf, such as
> resolve_btfids. Which is quite ironic, given the recent
> changes. Setting the irony aside, we could add smth like:
>           resolve_btfids --section-info=.BTF_ids $elf
>
> Reverting the gen-btf.sh patch is also a possible workaround, but I'd
> really like to avoid it, given that BPF features/optimizations in
> development depend on it.
>
> I'd appreciate comments and suggestions on this issue. Thank you!
> ---
>   kernel/module/main.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/kernel/module/main.c b/kernel/module/main.c
> index 710ee30b3bea..5bf456fad63e 100644
> --- a/kernel/module/main.c
> +++ b/kernel/module/main.c
> @@ -1568,6 +1568,13 @@ static int simplify_symbols(struct module *mod, const struct load_info *info)
>   			break;
>   
>   		default:
> +			if (sym[i].st_shndx >= info->hdr->e_shnum) {
> +				pr_err("%s: Symbol %s has an invalid section index %u (max %u)\n",
> +				       mod->name, name, sym[i].st_shndx, info->hdr->e_shnum - 1);
> +				ret = -ENOEXEC;
> +				break;
> +			}
> +
>   			/* Divert to percpu allocation if a percpu var. */
>   			if (sym[i].st_shndx == info->index.pcpu)
>   				secbase = (unsigned long)mod_percpu(mod);

I tried both llvm21 and llvm22 (where llvm21 is used in bpf ci).

Without KASAN, I can reproduce the failure for llvm19/llvm21/llvm22.
I did not test llvm20 and I assume it may fail too.

The following llvm patch
    https://github.com/llvm/llvm-project/pull/170462
can fix the issue. Currently it is still in review stage. The actual diff is

diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index e5de17e093df..cc1527d996e2 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -2168,7 +2168,11 @@ Error Object::updateSectionData(SecPtr &Sec, ArrayRef<uint8_t> Data) {
                               Data.size(), Sec->Name.c_str(), Sec->Size);
  
    if (!Sec->ParentSegment) {
-    Sec = std::make_unique<OwnedDataSection>(*Sec, Data);
+    SectionBase *Replaced = Sec.get();
+    SectionBase *Modified = &addSection<OwnedDataSection>(*Sec, Data);
+    DenseMap<SectionBase *, SectionBase *> Replacements{{Replaced, Modified}};
+    if (auto err = replaceSections(Replacements))
+      return err;
    } else {
      // The segment writer will be in charge of updating these contents.
      Sec->Size = Data.size();

I applied the above patch to latest llvm21 and llvm22 and
the crash is gone and the selftests can run properly.

With KASAN, everything is okay for llvm21 and llvm22.

Not sure whether the llvm patch
    https://github.com/llvm/llvm-project/pull/170462
can make into llvm21 or not as looks like llvm21 intends to
freeze for now. See
    https://github.com/llvm/llvm-project/pull/168314#issuecomment-3645797175
the llvm22 will branch into rc mode in January.

I will try to see whether we can have a reasonable workaround
for llvm21 llvm-objcopy (for without KASAN).


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ