Message-ID: <1aec4a9a-a30b-43fd-b303-7a351caeccb7@redhat.com>
Date: Mon, 7 Apr 2025 10:42:56 +0200
From: Viktor Malik <vmalik@...hat.com>
To: Shung-Hsi Yu <shung-hsi.yu@...e.com>, "Naveen N. Rao"
<naveen@...nel.org>, Hari Bathini <hbathini@...ux.ibm.com>,
bpf@...r.kernel.org
Cc: Michael Ellerman <mpe@...erman.id.au>, Mark Rutland
<mark.rutland@....com>, Daniel Borkmann <daniel@...earbox.net>,
Masahiro Yamada <masahiroy@...nel.org>, Nicholas Piggin <npiggin@...il.com>,
Alexei Starovoitov <ast@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Vishal Chourasia <vishalc@...ux.ibm.com>,
Mahesh J Salgaonkar <mahesh@...ux.ibm.com>, Miroslav Benes <mbenes@...e.cz>,
Michal Suchánek <msuchanek@...e.de>,
linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-trace-kernel@...r.kernel.org, live-patching@...r.kernel.org
Subject: Re: [BUG?] ppc64le: fentry BPF not triggered after live patch (v6.14)
On 3/31/25 15:19, Shung-Hsi Yu wrote:
> Hi all,
>
> On ppc64le (v6.14, kernel config attached), I've observed that fentry
> BPF programs stop being invoked after the target kernel function is live
> patched. This occurs regardless of whether the BPF program was attached
> before or after the live patch. I believe fentry/fprobe on ppc64le is
> added with [1].
>
> Steps to reproduce on ppc64le:
> - Use bpftrace (v0.10.0+) to attach a BPF program to cmdline_proc_show
> with fentry (kfunc is the older name bpftrace used for fentry, used
> here for max compatibility)
>
> bpftrace -e 'kfunc:cmdline_proc_show { printf("%lld: cmdline_proc_show() called by %s\n", nsecs(), comm) }'
>
> - Run `cat /proc/cmdline` and observe bpftrace output
>
> - Load samples/livepatch/livepatch-sample.ko
>
> - Run `cat /proc/cmdline` again. Observe "this has been live patched" in
> output, but no new bpftrace output.
>
> Note: once the live patching module is disabled through the sysfs interface
> the BPF program invocation is restored.
>
> Is this the expected interaction between fentry BPF and live patching?
> On x86_64 it does _not_ happen, so I'd guess the behavior on ppc64le is
> unintended. Any insights appreciated.
I'm not sure if this is related, but I found that when the kernel is
compiled with KASAN=y (full config attached), the above steps without
the bpftrace part lead to a kernel panic when running the second
`cat /proc/cmdline` command (the livepatched one).
Here's the relevant part of the kdump:
[ 457.405298] BUG: Unable to handle kernel data access on write at 0xc0000000000f9078
[ 457.405320] Faulting instruction address: 0xc0000000018ff958
[ 457.405328] Oops: Kernel access of bad area, sig: 11 [#1]
[ 457.405336] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
[ 457.405347] Modules linked in: livepatch_sample(K) bonding tls rfkill vmx_crypto ibmveth pseries_rng sg fuse loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock xfs sd_mod ibmvscsi scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
[ 457.405410] CPU: 6 UID: 0 PID: 5141 Comm: cat Kdump: loaded Tainted: G K 6.14.0+ #9 VOLUNTARY
[ 457.405424] Tainted: [K]=LIVEPATCH
[ 457.405430] Hardware name: IBM,9009-22A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW910.00 (VL910_062) hv:phyp pSeries
[ 457.405440] NIP: c0000000018ff958 LR: c0000000018ff930 CTR: c0000000009c0790
[ 457.405449] REGS: c00000005f2e7790 TRAP: 0300 Tainted: G K (6.14.0+)
[ 457.405459] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 2822880b XER: 20040000
[ 457.405484] CFAR: c0000000008addc0 DAR: c0000000000f9078 DSISR: 0a000000 IRQMASK: 1
GPR00: c0000000018f2584 c00000005f2e7a30 c00000000280a900 c000000017ffa488
GPR04: 0000000000000008 0000000000000000 c0000000018f24fc 000000000000000d
GPR08: fffffffffffe0000 000000000000000d 0000000000000000 0000000000008000
GPR12: c0000000009c0790 c000000017ffa480 c00000005f2e7c78 c0000000000f9070
GPR16: c00000005f2e7c90 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 c00000005f3efa80 c00000005f2e7c60 c00000005f2e7c88
GPR24: c00000005f2e7c60 0000000000000001 c0000000000f9078 0000000000000000
GPR28: 00007fff97960000 c000000017ffa480 0000000000000000 c0000000000f9078
[ 457.405605] NIP [c0000000018ff958] _raw_spin_lock_irqsave+0x68/0x110
[ 457.405619] LR [c0000000018ff930] _raw_spin_lock_irqsave+0x40/0x110
[ 457.405630] Call Trace:
[ 457.405635] [c00000005f2e7a30] [c000000000941804] check_heap_object+0x34/0x390 (unreliable)
[ 457.405651] [c00000005f2e7a70] [c0000000018f2584] __mutex_unlock_slowpath.isra.0+0xe4/0x230
[ 457.405665] [c00000005f2e7af0] [c0000000009c2f50] seq_read_iter+0x430/0xa90
[ 457.405679] [c00000005f2e7c00] [c000000000aade04] proc_reg_read_iter+0xa4/0x200
[ 457.405692] [c00000005f2e7c40] [c00000000095345c] vfs_read+0x41c/0x510
[ 457.405705] [c00000005f2e7d30] [c0000000009545d4] ksys_read+0xa4/0x190
[ 457.405716] [c00000005f2e7d90] [c00000000003a3f0] system_call_exception+0x1d0/0x440
[ 457.405729] [c00000005f2e7e50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
[ 457.405744] --- interrupt: 3000 at 0x7fff97e75044
[ 457.405755] NIP: 00007fff97e75044 LR: 00007fff97e75044 CTR: 0000000000000000
[ 457.405764] REGS: c00000005f2e7e80 TRAP: 3000 Tainted: G K (6.14.0+)
[ 457.405773] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 48222804 XER: 00000000
[ 457.405805] IRQMASK: 0
GPR00: 0000000000000003 00007fffc1908930 00007fff97f87100 0000000000000003
GPR04: 00007fff97960000 0000000000040000 0000000000000000 00007fff97f80248
GPR08: 0000000000000002 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fff9805a5a0 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000040000 00007fffc19091c8 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 00007fff9804f470
GPR24: 0000000000000000 0000000000040000 00007fffc190f1c5 000000007ff00000
GPR28: 0000000000000003 00007fff97960000 0000000000040000 0000000000000003
[ 457.405916] NIP [00007fff97e75044] 0x7fff97e75044
[ 457.405924] LR [00007fff97e75044] 0x7fff97e75044
[ 457.405932] --- interrupt: 3000
[ 457.405938] Code: 386d0008 4afae43d 60000000 a13d0008 3d00fffe 5529083c 61290001 7d40f829 7d474079 40c20018 7d474038 7ce74b78 <7ce0f92d> 40c2ffe8 7c2004ac 794a03e1
[ 457.405981] ---[ end trace 0000000000000000 ]---
[ 457.419259] pstore: backend (nvram) writing error (-1)
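As a rough aid for reading the oops above, here is a minimal Python
sketch that splits the faulting instruction word (the <7ce0f92d> marked
in the "Code:" line) into Power ISA X-form fields. The helper name
decode_x_form is made up for illustration, and the mapping of primary
opcode 31 with extended opcode 150 to stwcx. comes from the Power ISA,
not from this thread. Under those assumptions the faulting instruction
is a store-conditional (stwcx. r7,0,r31) whose target address in r31
matches the reported DAR. This only confirms what faulted, not why.

```python
# Values copied from the oops above.
FAULTING_WORD = 0x7CE0F92D   # "<7ce0f92d>" in the "Code:" line
GPR31 = 0xC0000000000F9078   # last register in the GPR28 row
DAR = 0xC0000000000F9078     # "DAR: c0000000000f9078"


def decode_x_form(word: int) -> dict:
    """Split a 32-bit Power ISA X-form instruction word into its fields.

    Hypothetical helper for illustration; it does not validate that the
    word really is an X-form instruction.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode
        "rs": (word >> 21) & 0x1F,      # source register
        "ra": (word >> 16) & 0x1F,      # base register (0 means literal 0)
        "rb": (word >> 11) & 0x1F,      # index register
        "xo": (word >> 1) & 0x3FF,      # extended opcode
        "rc": word & 0x1,               # record bit
    }


fields = decode_x_form(FAULTING_WORD)
print(fields)

# Assumption from the Power ISA: opcode 31, XO 150, Rc=1 encodes stwcx.
is_stwcx = (fields["opcode"], fields["xo"], fields["rc"]) == (31, 150, 1)
print("stwcx.:", is_stwcx, "| r31 == DAR:", GPR31 == DAR)
```

With RA encoded as 0, the effective address is just the value in the
index register, which is why comparing r31 against DAR is meaningful
here.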
Interestingly, the panic doesn't occur while the bpftrace process is
running. In that case, `cat /proc/cmdline` works (and even prints the
expected livepatched message), but the call doesn't appear in the
bpftrace output, as Shung-Hsi observed.
On a kernel with KASAN=n, no panic happens.
This panic doesn't seem to be related to BPF (as it happens when no BPF
programs are involved) but it involves livepatch and occurs for the same
sequence of commands, so the two cases may be related. In this case, I
suspect that the issue is caused by an incorrect interaction of
livepatch and the ftrace changes introduced for BPF trampolines [1].
FWIW, patch cfec8463d9a1 ("powerpc/ftrace: Fix ftrace bug with
KASAN=y") fixes a bug in [1] that appears on KASAN=y kernels, but
I'm not sure whether it's related to this issue.
Viktor
[1] https://lore.kernel.org/all/20241030070850.1361304-1-hbathini@linux.ibm.com/
>
>
> Thanks,
> Shung-Hsi Yu
>
> 1: https://lore.kernel.org/all/20241030070850.1361304-2-hbathini@linux.ibm.com/
>
View attachment ".config" of type "text/plain" (141341 bytes)