Date:   Thu, 20 May 2021 22:31:18 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Michal Suchánek <msuchanek@...e.de>
Cc:     bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>,
        Networking <netdev@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...nel.org>
Subject: Re: BPF: failed module verification on linux-next

On Wed, May 19, 2021 at 7:19 AM Michal Suchánek <msuchanek@...e.de> wrote:
>
> Hello,
>
> linux-next fails to boot for me:
>
> [    0.000000] Linux version 5.13.0-rc2-next-20210519-1.g3455ff8-vanilla (geeko@...ldhost) (gcc (SUSE Linux) 10.3.0, GNU ld (GNU Binutils;
> openSUSE Tumbleweed) 2.36.1.20210326-3) #1 SMP Wed May 19 10:05:10 UTC 2021 (3455ff8)
> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla root=UUID=ec42c33e-a2c2-4c61-afcc-93e9527
> 8f687 plymouth.enable=0 resume=/dev/disk/by-uuid/f1fe4560-a801-4faf-a638-834c407027c7 mitigations=auto earlyprintk initcall_debug nomodeset
>  earlycon ignore_loglevel console=ttyS0,115200
> ...
> [   26.093364] calling  tracing_set_default_clock+0x0/0x62 @ 1
> [   26.098937] initcall tracing_set_default_clock+0x0/0x62 returned 0 after 0 usecs
> [   26.106330] calling  acpi_gpio_handle_deferred_request_irqs+0x0/0x7c @ 1
> [   26.113033] initcall acpi_gpio_handle_deferred_request_irqs+0x0/0x7c returned 0 after 3 usecs
> [   26.121559] calling  clk_disable_unused+0x0/0x102 @ 1
> [   26.126620] initcall clk_disable_unused+0x0/0x102 returned 0 after 0 usecs
> [   26.133491] calling  regulator_init_complete+0x0/0x25 @ 1
> [   26.138890] initcall regulator_init_complete+0x0/0x25 returned 0 after 0 usecs
> [   26.147816] Freeing unused decrypted memory: 2036K
> [   26.153682] Freeing unused kernel image (initmem) memory: 2308K
> [   26.165776] Write protecting the kernel read-only data: 26624k
> [   26.173067] Freeing unused kernel image (text/rodata gap) memory: 2036K
> [   26.180416] Freeing unused kernel image (rodata/data gap) memory: 1184K
> [   26.187031] Run /init as init process
> [   26.190693]   with arguments:
> [   26.193661]     /init
> [   26.195933]   with environment:
> [   26.199079]     HOME=/
> [   26.201444]     TERM=linux
> [   26.204152]     BOOT_IMAGE=/boot/vmlinuz-5.13.0-rc2-next-20210519-1.g3455ff8-vanilla
> [   26.254154] BPF:      type_id=35503 offset=178440 size=4
> [   26.259125] BPF:
> [   26.261054] BPF:Invalid offset
> [   26.264119] BPF:

It took me a while to reliably bisect this, but it clearly points to
this commit:

e481fac7d80b ("mm/page_alloc: convert per-cpu list protection to local_lock")

One commit before it, 676535512684 ("mm/page_alloc: split per cpu page
lists and zone stats -fix"), works just fine.

I'll have to spend more time debugging what exactly is happening, but
the immediate problem is that there are two different definitions of
the numa_node per-cpu variable. Both are at the same offset within the
.data..percpu ELF section and both have the same name, but one is
marked as static and the other as global. One is an int variable,
while the other is a struct pagesets. I'll look some more tomorrow,
but I'm adding Jiri and Arnaldo for visibility.

[110907] DATASEC '.data..percpu' size=178904 vlen=303
...
        type_id=27753 offset=163976 size=4 (VAR 'numa_node')
        type_id=27754 offset=163976 size=4 (VAR 'numa_node')

[27753] VAR 'numa_node' type_id=27556, linkage=static
[27754] VAR 'numa_node' type_id=20, linkage=global

[20] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED

[27556] STRUCT 'pagesets' size=0 vlen=1
        'lock' type_id=507 bits_offset=0

[506] STRUCT '(anon)' size=0 vlen=0
[507] TYPEDEF 'local_lock_t' type_id=506

So there is also something weird about the zero-sized struct pagesets
and the local_lock_t inside it.
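
For anyone who wants to look for the same pattern in their own build, here
is a minimal sketch that scans DATASEC entries in the textual format shown
above and reports variables whose byte ranges overlap. The names
VarSecInfo, parse_datasec_entries and find_overlaps are hypothetical
helpers made up for this illustration, and the check only models the
"two VARs at the same offset" symptom; it is not the kernel's actual
validation code.

```python
import re
from typing import List, NamedTuple, Tuple


class VarSecInfo(NamedTuple):
    """One variable entry of a DATASEC, as printed in the dump above."""
    type_id: int
    offset: int
    size: int
    name: str


# Matches lines such as:
#   type_id=27753 offset=163976 size=4 (VAR 'numa_node')
_ENTRY_RE = re.compile(
    r"type_id=(\d+)\s+offset=(\d+)\s+size=(\d+)\s+\(VAR '([^']*)'\)"
)


def parse_datasec_entries(dump_text: str) -> List[VarSecInfo]:
    """Collect every DATASEC variable entry found in the dump text."""
    entries = []
    for line in dump_text.splitlines():
        m = _ENTRY_RE.search(line)
        if m:
            entries.append(
                VarSecInfo(int(m.group(1)), int(m.group(2)),
                           int(m.group(3)), m.group(4))
            )
    return entries


def find_overlaps(
    entries: List[VarSecInfo],
) -> List[Tuple[VarSecInfo, VarSecInfo]]:
    """Return pairs (earlier, later) whose [offset, offset+size) ranges overlap.

    Entries are sorted by offset; each one is compared against the
    previously seen entry that reaches furthest into the section.
    Zero-sized entries never count as overlapping under this rule.
    """
    ordered = sorted(entries, key=lambda e: (e.offset, e.type_id))
    overlaps = []
    widest = None
    for cur in ordered:
        if widest is not None and cur.offset < widest.offset + widest.size:
            overlaps.append((widest, cur))
        if widest is None or cur.offset + cur.size > widest.offset + widest.size:
            widest = cur
    return overlaps


# The two entries quoted from the .data..percpu DATASEC above.
SAMPLE = """
        type_id=27753 offset=163976 size=4 (VAR 'numa_node')
        type_id=27754 offset=163976 size=4 (VAR 'numa_node')
"""

if __name__ == "__main__":
    for first, second in find_overlaps(parse_datasec_entries(SAMPLE)):
        print(f"overlap at offset {second.offset}: "
              f"[{first.type_id}] '{first.name}' vs "
              f"[{second.type_id}] '{second.name}'")
```

Feeding it the full dump text instead of SAMPLE should list every such
pair in the section, assuming the dump uses the same line format as the
excerpt above.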

> [   26.264119]
> [   26.267437] failed to validate module [efivarfs] BTF: -22
> [   26.316724] systemd[1]: systemd 246.13+suse.105.g14581e0120 running in system mode. (+PAM +AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINI
> T +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified)
> [   26.357990] systemd[1]: Detected architecture x86-64.
> [   26.363068] systemd[1]: Running in initial RAM disk.
>

[...]
