Message-ID: <90b3b613-8665-425b-8132-5b9ac86ab616@oracle.com>
Date: Thu, 7 Nov 2024 15:05:56 +0000
From: Alan Maguire <alan.maguire@...cle.com>
To: Laura Nao <laura.nao@...labora.com>, regressions@...ts.linux.dev
Cc: linux-kernel@...r.kernel.org, kernel@...labora.com, bpf@...r.kernel.org,
        chrome-platform@...ts.linux.dev
Subject: Re: [REGRESSION] module BTF validation failure (Error -22) on next

On 06/11/2024 16:08, Laura Nao wrote:
> Hello,
> 
> KernelCI has detected a module loading regression affecting all AMD and 
> Intel Chromebooks in the Collabora LAVA lab, occurring between 
> next-20241024 and next-20241025.
> 
> The logs indicate a failure in BTF module validation, preventing all 
> modules from loading correctly (with CONFIG_MODULE_ALLOW_BTF_MISMATCH 
> unset). The example below is from an AMD Chromebook (HP 14b na0052xx), 
> with similar errors observed on other AMD and Intel devices:
> 
> [    5.284373] failed to validate module [cros_kbd_led_backlight] BTF: -22
> [    5.291392] failed to validate module [i2c_hid] BTF: -22
> [    5.293958] failed to validate module [chromeos_pstore] BTF: -22
> [    5.302832] failed to validate module [coreboot_table] BTF: -22
> [    5.309175] failed to validate module [raydium_i2c_ts] BTF: -22
> [    5.309264] failed to validate module [i2c_cros_ec_tunnel] BTF: -22
> [    5.322158] failed to validate module [typec] BTF: -22
> [    5.327554] failed to validate module [snd_timer] BTF: -22
> [    5.327573] failed to validate module [cros_usbpd_notify] BTF: -22
> [    5.339272] failed to validate module [elan_i2c] BTF: -22
> [    5.345821] failed to validate module [industrialio] BTF: -22
> [    5.423113] failed to validate module [cfg80211] BTF: -22
> [    5.443074] failed to validate module [cros_ec_dev] BTF: -22
> [    5.448857] failed to validate module [snd_pci_acp3x] BTF: -22
> [    5.454736] failed to validate module [cros_kbd_led_backlight] BTF: -22
> [    5.461458] failed to validate module [regmap_i2c] BTF: -22
> [    5.470228] failed to validate module [i2c_piix4] BTF: -22
> [    5.491123] failed to validate module [i2c_hid] BTF: -22
> [    5.491226] failed to validate module [chromeos_pstore] BTF: -22
> [    5.496519] failed to validate module [coreboot_table] BTF: -22
> [    5.502632] failed to validate module [snd_timer] BTF: -22
> [    5.538916] failed to validate module [gsmi] BTF: -22
> [    5.604971] failed to validate module [mii] BTF: -22
> [    5.604971] failed to validate module [videobuf2_common] BTF: -22
> [    5.604972] failed to validate module [sp5100_tco] BTF: -22
> [    5.616068] failed to validate module [snd_soc_acpi] BTF: -22
> [    5.680553] failed to validate module [bluetooth] BTF: -22
> [    5.749320] failed to validate module [chromeos_pstore] BTF: -22
> [    5.755440] failed to validate module [mii] BTF: -22
> [    5.760522] failed to validate module [snd_timer] BTF: -22
> [    5.783549] failed to validate module [bluetooth] BTF: -22
> [    5.841561] failed to validate module [mii] BTF: -22
> [    5.846699] failed to validate module [snd_timer] BTF: -22
> [    5.892444] failed to validate module [mii] BTF: -22
> [    5.897708] failed to validate module [snd_timer] BTF: -22
> [    5.945507] failed to validate module [snd_timer] BTF: -22
> 
> The full kernel log is available on [1]. The config used is available on
> [2] and the kernel/modules have been built using gcc-12.
> 
> The issue is still present on next-20241105.
> 
> I'm sending this report to track the regression while a fix is
> identified. The culprit commit hasn't been pinpointed yet, I'll report
> back once it's identified.
> 
> Any feedback or suggestion for additional debugging steps would be greatly 
> appreciated.
> 
> Best,
>

Thanks for the report! Judging from the config, you're seeing this with
pahole v1.24. I have seen issues like this in the past where during a
kernel build, module BTF has been built against vmlinux BTF, and then
something later re-triggers vmlinux BTF generation. If that re-triggered
vmlinux BTF does not use the same type ids for types, this can result in
mismatch errors as above since modules are referring to out-of-date type
ids in vmlinux. That's just a preliminary guess though, we'll
need more info to help get to the bottom of this.
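
If raw BTF dumps from two vmlinux builds are available (for example, one taken when the modules were built and one from the final image), this guess can be checked roughly as below. This is only a sketch under assumptions: it expects the text format printed by "bpftool btf dump file", and the helper names btf_ids and btf_id_drift, as well as the input file names, are made up for illustration.

```shell
# Print "id kind name" for every type line in a raw BTF dump.
# Type lines look like: [1] INT 'long unsigned int' size=8 ...
# Member lines are indented and do not start with "[", so they are skipped.
btf_ids() {
    sed -n "s/^\[\([0-9][0-9]*\)\] \([A-Z_]*\) '\([^']*\)'.*/\1 \2 \3/p" "$1"
}

# Count type ids from the first dump that no longer map to the same
# kind and name in the second dump (changed or missing ids).
btf_id_drift() {
    old_ids=$(mktemp)
    new_ids=$(mktemp)
    btf_ids "$1" | sort > "$old_ids"
    btf_ids "$2" | sort > "$new_ids"
    # comm -23 keeps lines that appear only in the first sorted file.
    comm -23 "$old_ids" "$new_ids" | wc -l | tr -d ' '
    rm -f "$old_ids" "$new_ids"
}

# Hypothetical usage:
#   btf_id_drift vmlinux_at_module_build.raw vmlinux_final.raw
```

A non-zero result would be consistent with modules referring to stale vmlinux type ids; zero would point away from this explanation.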

A few suggestions to help debug this:

- if you have build logs, check BTF generation of vmlinux. Did it
perhaps happen twice? Even better, if KernelCI saves build logs, feel
free to send a pointer and I'll take a look.
- can you post the vmlinux (stripped of DWARF data if possible to limit
size) and one of the failing modules somewhere so we can analyze?
- Failing that, run
bpftool btf dump file /path/2/vmlinux_from_build > vmlinux.raw
and upload vmlinux.raw together with one of the failing module .ko files.
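
For the build-log check, something along these lines may help. It is a rough sketch and assumes the log comes from a non-verbose kbuild run in which vmlinux BTF steps appear as lines starting with "BTF" and module BTF steps carry an "[M]" tag; the function name count_vmlinux_btf_steps and the file name build.log are made up for illustration.

```shell
# Count BTF generation steps for vmlinux in a kbuild log.
# Lines tagged "[M]" are per-module BTF steps and are excluded.
count_vmlinux_btf_steps() {
    grep -E '^[[:space:]]*BTF[[:space:]]' "$1" | grep -vc '\[M\]'
}

# Hypothetical usage:
#   count_vmlinux_btf_steps build.log
```

A count above one would be consistent with vmlinux BTF having been regenerated after module BTF was built against it, though the exact log format may differ between kernel versions.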

I've tried to reproduce this; no luck so far at my end.

Alan

> Laura
> 
> [1] https://pastebin.com/raw/dtvzBkxh
> [2] https://pastebin.com/raw/a1MGi3wH
> 
> #regzbot introduced: next-20241024..next-20241025
> 
> 

