Message-ID: <af906e9e-8f94-41f5-9100-1a3b4526e220@linux.dev>
Date: Mon, 29 Dec 2025 12:40:10 -0800
From: Ihor Solodrai <ihor.solodrai@...ux.dev>
To: Yonghong Song <yonghong.song@...ux.dev>,
 Luis Chamberlain <mcgrof@...nel.org>, Petr Pavlu <petr.pavlu@...e.com>,
 Daniel Gomez <da.gomez@...nel.org>, Nathan Chancellor <nathan@...nel.org>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-modules@...r.kernel.org,
 bpf@...r.kernel.org, linux-kbuild@...r.kernel.org, llvm@...ts.linux.dev
Subject: Re: [RFC PATCH v1] module: Fix kernel panic when a symbol st_shndx is
 out of bounds

On 12/23/25 9:36 PM, Yonghong Song wrote:
> 
> 
> On 12/23/25 4:57 PM, Ihor Solodrai wrote:
>> [...]
>>
>> Until this llvm-objcopy bug is fixed, we cannot trust it in the
>> kernel build pipeline. In the short term we have to come up with a
>> workaround for .BTF_ids section update and replace the calls to
>> ${OBJCOPY} --update-section with something else.
>>
>> One potential workaround is to force the use of the objcopy (from
>> binutils) instead of llvm-objcopy when updating .BTF_ids section.

I think the simplest workaround is this one: use objcopy from binutils
instead of llvm-objcopy when doing --update-section.

There are just 3 places where that happens, so the OBJCOPY
substitution is going to be localized.

Also binutils is a documented requirement for compiling the kernel,
whether with clang or not [1].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/changes.rst?h=v6.18#n29

>>
>> Alternatively, we could just dd the .BTF_ids data computed by
>> resolve_btfids at the right offset in the target ELF file.
>>
>> Surprisingly, I couldn't find a good way to read a section's offset
>> and size from an ELF file in a well-defined format on the command
>> line. Both readelf and {llvm-}objdump give human-readable output, and
>> it appears we can't rely on the column order, for example.
>>
>> We could still try parsing readelf output with awk/grep, covering
>> output variants that appear in the kernel build.
>>
>> We can also do:
>>
>>     llvm-readobj --elf-output-style=JSON --sections "$elf" | \
>>          jq -r --arg name .BTF_ids '
>>              .[0].Sections[] |
>>              select(.Section.Name.Name == $name) |
>>              "\(.Section.Offset) \(.Section.Size)"'
>>
>> ...but idk man, doesn't feel right.
>>
>> Most reliable way to determine the size and offset of .BTF_ids section
>> is probably reading them by a C program with libelf, such as
>> resolve_btfids. Which is quite ironic, given the recent
>> changes. Setting the irony aside, we could add smth like:
>>           resolve_btfids --section-info=.BTF_ids $elf
>>
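
For illustration only: the dd idea above amounts to looking up the
section's file offset and size in the section header table and
overwriting that byte range. Below is a minimal Python sketch, assuming
a little-endian ELF64 file without extended section numbering and new
data of exactly the same size as the existing section. find_section and
patch_section are hypothetical helper names, not part of resolve_btfids
or any kernel script.

```python
import struct

SHDR = struct.Struct("<IIQQQQIIQQ")  # Elf64_Shdr, little-endian, 64 bytes


def find_section(elf, name):
    """Return (file offset, size) of the section called `name`."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (shoff,) = struct.unpack_from("<Q", elf, 0x28)  # e_shoff
    # e_shentsize, e_shnum, e_shstrndx sit at the end of the ELF header.
    shentsize, shnum, shstrndx = struct.unpack_from("<HHH", elf, 0x3A)
    hdrs = [SHDR.unpack_from(elf, shoff + i * shentsize) for i in range(shnum)]
    strtab = hdrs[shstrndx][4]  # sh_offset of the section name string table
    wanted = name.encode()
    for sh_name, _type, _flags, _addr, sh_offset, sh_size, *_ in hdrs:
        start = strtab + sh_name
        end = elf.index(b"\0", start)  # names are NUL-terminated
        if elf[start:end] == wanted:
            return sh_offset, sh_size
    raise KeyError(name)


def patch_section(elf, name, data):
    """Overwrite the section contents in place; `elf` must be a bytearray."""
    offset, size = find_section(elf, name)
    if len(data) != size:
        raise ValueError("in-place update requires identical size")
    elf[offset:offset + size] = data
```

The (offset, size) pair returned by find_section is the same
information the dd variant would need for its seek and count
arguments.
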
>> Reverting the gen-btf.sh patch is also a possible workaround, but I'd
>> really like to avoid it, given that BPF features/optimizations in
>> development depend on it.
>>
>> I'd appreciate comments and suggestions on this issue. Thank you!
>> ---
>>   kernel/module/main.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/kernel/module/main.c b/kernel/module/main.c
>> index 710ee30b3bea..5bf456fad63e 100644
>> --- a/kernel/module/main.c
>> +++ b/kernel/module/main.c
>> @@ -1568,6 +1568,13 @@ static int simplify_symbols(struct module *mod, const struct load_info *info)
>>               break;
>>             default:
>> +            if (sym[i].st_shndx >= info->hdr->e_shnum) {
>> +                pr_err("%s: Symbol %s has an invalid section index %u (max %u)\n",
>> +                       mod->name, name, sym[i].st_shndx, info->hdr->e_shnum - 1);
>> +                ret = -ENOEXEC;
>> +                break;
>> +            }
>> +
>>               /* Divert to percpu allocation if a percpu var. */
>>               if (sym[i].st_shndx == info->index.pcpu)
>>                   secbase = (unsigned long)mod_percpu(mod);
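
For illustration only: a check similar to the one in this hunk can be
approximated in user space to spot an affected module file before
loading it. Below is a minimal Python sketch, assuming a little-endian
ELF64 file without extended section numbering. bad_symbol_indices is a
hypothetical name. Unlike the kernel's switch statement, the sketch
simply skips the whole reserved index range instead of handling
SHN_ABS, SHN_COMMON and friends individually.

```python
import struct

SHT_SYMTAB = 2
SHN_LORESERVE = 0xFF00  # indices from here up are reserved (SHN_ABS, ...)
SYM_SIZE = 24           # sizeof(Elf64_Sym)


def bad_symbol_indices(elf):
    """Return symbol table indices whose st_shndx is >= e_shnum."""
    (shoff,) = struct.unpack_from("<Q", elf, 0x28)  # e_shoff
    shentsize, shnum, _shstrndx = struct.unpack_from("<HHH", elf, 0x3A)
    bad = []
    for i in range(shnum):
        base = shoff + i * shentsize
        (sh_type,) = struct.unpack_from("<I", elf, base + 4)
        if sh_type != SHT_SYMTAB:
            continue
        sh_offset, sh_size = struct.unpack_from("<QQ", elf, base + 0x18)
        for n in range(sh_size // SYM_SIZE):
            # st_shndx is a 16-bit field at byte offset 6 of Elf64_Sym.
            (st_shndx,) = struct.unpack_from("<H", elf, sh_offset + n * SYM_SIZE + 6)
            if shnum <= st_shndx < SHN_LORESERVE:
                bad.append(n)
    return bad
```

An empty list means no symbol points past the section header table;
any entry corresponds to a symbol that the proposed check would reject
with -ENOEXEC.
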
> 
> I tried both llvm21 and llvm22 (where llvm21 is used in bpf ci).
> 
> Without KASAN, I can reproduce the failure for llvm19/llvm21/llvm22.
> I did not test llvm20 and I assume it may fail too.
> 
> The following llvm patch
>    https://github.com/llvm/llvm-project/pull/170462
> can fix the issue. Currently it is still in review stage. The actual diff is
> 
> diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
> index e5de17e093df..cc1527d996e2 100644
> --- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
> +++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
> @@ -2168,7 +2168,11 @@ Error Object::updateSectionData(SecPtr &Sec, ArrayRef<uint8_t> Data) {
>                               Data.size(), Sec->Name.c_str(), Sec->Size);
>  
>    if (!Sec->ParentSegment) {
> -    Sec = std::make_unique<OwnedDataSection>(*Sec, Data);
> +    SectionBase *Replaced = Sec.get();
> +    SectionBase *Modified = &addSection<OwnedDataSection>(*Sec, Data);
> +    DenseMap<SectionBase *, SectionBase *> Replacements{{Replaced, Modified}};
> +    if (auto err = replaceSections(Replacements))
> +      return err;
>    } else {
>      // The segment writer will be in charge of updating these contents.
>      Sec->Size = Data.size();
> 
> I applied the above patch to latest llvm21 and llvm22 and
> the crash is gone and the selftests can run properly.

Hi Yonghong, thank you for confirming the issue.

Patching llvm-objcopy would be great, and it should be done. But we
still have to make sure that older LLVM releases can build the kernel.
So even if the fix is backported to v21, it won't help us much,
unfortunately.

> 
> With KASAN, everything is okay for llvm21 and llvm22.
> 
> Not sure whether the llvm patch
>    https://github.com/llvm/llvm-project/pull/170462
> can make it into llvm21 or not, as it looks like llvm21 intends to
> freeze for now. See
>    https://github.com/llvm/llvm-project/pull/168314#issuecomment-3645797175
> llvm22 will branch into rc mode in January.
> 
> I will try to see whether we can have a reasonable workaround
> for llvm21 llvm-objcopy (for without KASAN).
> 

