Message-ID: <aIEePonPatjNrJVk@J2N7QTR9R3>
Date: Wed, 23 Jul 2025 18:39:10 +0100
From: Mark Rutland <mark.rutland@....com>
To: Yunseong Kim <ysk@...lloc.com>
Cc: Will Deacon <will@...nel.org>, Austin Kim <austindh.kim@...il.com>,
Michelle Jin <shjy180909@...il.com>,
linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
Yeoreum Yun <yeoreum.yun@....com>, syzkaller@...glegroups.com,
Kan Liang <kan.liang@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Namhyung Kim <namhyung@...nel.org>
Subject: Re: [PATCH] perf: arm_pmuv3: Fix kernel panic on UBSAN from negative
hw.idx in armv8pmu_enable/disable_event()
On Wed, Jul 23, 2025 at 01:35:31PM +0100, Mark Rutland wrote:
> [ dropping Hemendra, since he doesn't need to be spammed with ML traffic ]
>
> On Wed, Jul 23, 2025 at 10:44:03AM +0000, Yunseong Kim wrote:
> > When 'event->hw.idx' was negative in armv8pmu_enable/disable_event(), UBSAN
> > reported:
> >
> > UBSAN: shift-out-of-bounds in drivers/perf/arm_pmuv3.c:716:25
> > shift exponent -1 is negative
> >
> > UBSAN: shift-out-of-bounds in drivers/perf/arm_pmuv3.c:658:13
> > shift exponent -1 is negative
> >
> > This occurred because a perf_event could reach armv8pmu_enable/disable_event()
> > with a negative idx, typically when a valid counter could not be allocated.
>
> These functions are never supposed to be called for an event with a
> negative idx. For that to happen there must either be an earlier bug (at
> the time of pmu::add()) or there's a concurrency bug.
AFAICT this is a result of the group throttling logic introduced in
commit:
9734e25fbf5ae68e ("perf: Fix the throttle logic for a group")
That doesn't take into account that sibling events could have
event->state <= PERF_EVENT_STATE_OFF, e.g. by virtue of
perf_event_attr::disabled. For those events, event_sched_in() won't
initialise the event, e.g. won't call event->pmu->add().
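For reference, event_sched_in() bails out early for such events with a check
along these lines (paraphrased, not a verbatim quote of any particular tree):
| 	if (event->state <= PERF_EVENT_STATE_OFF)
| 		return 0;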
Thus when perf_event_throttle_group() and perf_event_unthrottle_group()
iterate over events with:
	for_each_sibling_event(sibling, leader)
		perf_event_[un]throttle(sibling, ...);
... perf_event_[un]throttle() will call event->pmu->stop() and
event->pmu->start() for those disabled events, resulting in the UBSAN
splat above since they don't have a hw idx (which is assigned by
event->pmu->add()).
I think the event's state needs to be taken into account somewhere during
throttling. Given that event_sched_in() already checks the event's state, I'd
assume that check belongs in the core code, rather than in each architecture's
pmu::{start,stop}().
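To illustrate the shape of the check I mean, using the same shorthand as the
snippet above (this is only an untested sketch, and the check might equally
belong in perf_event_[un]throttle() itself or need to cover the leader too):
| 	for_each_sibling_event(sibling, leader) {
| 		/*
| 		 * Siblings that were never scheduled in (e.g. created with
| 		 * perf_event_attr::disabled) have no hw idx, so they must
| 		 * not see pmu->stop()/pmu->start().
| 		 */
| 		if (sibling->state <= PERF_EVENT_STATE_OFF)
| 			continue;
| 		perf_event_[un]throttle(sibling, ...);
| 	}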
I can reproduce this locally with:
| #include <stdio.h>
| #include <stdlib.h>
| #include <unistd.h>
|
| #include <sys/syscall.h>
| #include <sys/types.h>
|
| #include <linux/perf_event.h>
|
| static int perf_event_open(struct perf_event_attr *attr, pid_t pid, int cpu,
| int group_fd, unsigned long flags)
| {
| return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
| }
|
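| /* Group leader: samples every cycle, so it hits the throttling path quickly. */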
| struct perf_event_attr attr_parent = {
| .type = PERF_TYPE_HARDWARE,
| .size = sizeof(attr_parent),
| .config = PERF_COUNT_HW_CPU_CYCLES,
| .sample_period = 1,
| .exclude_kernel = 1,
| };
|
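| /*
|  * Disabled sibling: never scheduled in, so pmu->add() never assigns it a
|  * counter and hw.idx stays negative.
|  */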
| struct perf_event_attr attr_child = {
| .type = PERF_TYPE_HARDWARE,
| .size = sizeof(attr_child),
| .config = PERF_COUNT_HW_CPU_CYCLES,
| .exclude_kernel = 1,
| .disabled = 1,
| };
|
| int main(int argc, char *argv[])
| {
| int parent, child;
|
| parent = perf_event_open(&attr_parent, 0, -1, -1, 0);
| if (parent < 0) {
| fprintf(stderr, "Unable to create event: %d\n", parent);
| exit (-1);
| }
|
| child = perf_event_open(&attr_child, 0, -1, parent, 0);
| if (child < 0) {
| fprintf(stderr, "Unable to create event: %d\n", child);
| exit (-1);
| }
|
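| /* Spin in userspace so the leader keeps counting and overflowing. */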
| for (;;) {
| asm("" ::: "memory");
| }
|
| return 0;
| }
I've kept the context below for anyone new to this thread, but hopefully
the above is clear?
Mark.
> > This issue was observed when running KVM on Radxa's Orion6 platform.
>
> Do you mean that you see this on the host, or in a guest?
>
> Are you using pseudo-NMI?
>
> > The issue was previously guarded indirectly by armv8pmu_event_is_chained(),
> > which internally warned and returned false for idx < 0. But since the
> > commit 29227d6ea157 ("arm64: perf: Clean up enable/disable calls"), this
> > check was removed.
>
> The warning there was because this case *should not happen*, and
> returning false was a way of minimizing the risk of a crash before the
> warning was logged.
>
> I don't think that armv8pmu_event_is_chained() would avoid the UBSAN
> splat. Prior to commit 29227d6ea157 we had:
>
> | static inline void armv8pmu_enable_event_counter(struct perf_event *event)
> | {
> | struct perf_event_attr *attr = &event->attr;
> | int idx = event->hw.idx;
> |
> | ...
> |
> | if (!kvm_pmu_counter_deferred(attr)) {
> | armv8pmu_enable_counter(idx);
> | if (armv8pmu_event_is_chained(event))
> | armv8pmu_enable_counter(idx - 1);
> | }
> | }
>
> Note the first call to armv8pmu_enable_counter(idx), so this wouldn't
> help for an event with event->hw.idx==-1, and the only other way we
> could get here is with a chained event with event->hw.idx==0, which is
> not valid.
>
> > To prevent undefined behavior, add an explicit guard to early return from
> > armv8pmu event if hw.idx < 0,
>
> That is not a correct fix, and simply hides the real bug. It should not
> be possible to reach this code when hw.idx < 0, and idx should be >= 0
> whenever pmu::add() succeeds.
>
> > similar to handling in other PMU drivers.
> > (e.g. intel_pmu_disable_event() on arch/x86/events/intel/core.c)
>
> I think what you're saying here is that intel_pmu_disable_event() will
> pr_warn() and return early in this case. As above, that is because this
> case is not expected to occur, and indicates a bug elsewhere.
>
> > $ ./syz-execprog -executor=./syz-executor -repeat=0 -sandbox=none \
> > -disable=binfmt_misc,cgroups,close_fds,devlink_pci,ieee802154,net_dev,net_reset,nic_vf,swap,sysctl,tun,usb,vhci,wifi \
> > -procs=8 perf.syz
> >
> > r0 = perf_event_open(&(0x7f0000000240)={
> > 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > @perf_config_ext
> > }, 0x0, 0x0, 0xffffffffffffffff, 0x0)
> >
> > perf_event_open(&(0x7f0000000280)={
> > 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > @perf_config_ext
> > }, 0x0, 0x0, r0, 0x0)
> >
> > perf_event_open(&(0x7f0000000540)={
> > 0x3, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4,
> > @perf_bp={0x0, 0x2}, 0x20, 0x0, 0x0, 0x0, 0x2,
> > 0x0, 0x0, 0x0, 0x0, 0x0, 0x81
> > }, 0x0, 0x0, r0, 0x0)
>
> This isn't all that helpful for reproducing the issue. Are the later
> lines the contents of 'perf.syz'? My local build of syz-execprog can't
> seem to parse this and prints help/usage.
>
> Has your syzkaller instance managed to generate a C reproducer that you
> can share?
>
> It should be possible to manually build a test from the above, but
> that's rather tedious.
>
> > ------------[ cut here ]------------
> > UBSAN: shift-out-of-bounds in drivers/perf/arm_pmuv3.c:716:25
> > shift exponent -1 is negative
> > CPU: 0 UID: 0 PID: 8405 Comm: syz.3.19 Tainted: G W 6.16.0-rc2-g5982a539cdce #3 PREEMPT
> > Tainted: [W]=WARN
> > Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
> > Call trace:
> > show_stack+0x2c/0x3c (C)
> > __dump_stack+0x30/0x40
> > dump_stack_lvl+0xd8/0x12c
> > dump_stack+0x1c/0x28
> > ubsan_epilogue+0x14/0x48
> > __ubsan_handle_shift_out_of_bounds+0x2b0/0x34c
> > armv8pmu_enable_event+0x3c4/0x4b0
> > armpmu_start+0xc4/0x118
> > perf_event_unthrottle_group+0x3a8/0x50c
> > perf_adjust_freq_unthr_events+0x2f4/0x578
> > perf_adjust_freq_unthr_context+0x278/0x46c
> > perf_event_task_tick+0x394/0x5b0
>
> AFAICT perf_event_task_tick() is called with IRQs masked, and
> perf_adjust_freq_unthr_context() disables the PMU for the duration of
> the state manipulation, so this shouldn't be able to race with anything
> that's using appropriate IRQ masking.
>
> This might be able to race with a pNMI though, and it looks like we're
> not entirely robust.
>
> > sched_tick+0x314/0x6cc
> > update_process_times+0x374/0x4b0
> > tick_nohz_handler+0x334/0x480
> > __hrtimer_run_queues+0x3ec/0xb78
> > hrtimer_interrupt+0x2b8/0xb50
> > arch_timer_handler_virt+0x74/0x88
> > handle_percpu_devid_irq+0x174/0x308
> > generic_handle_domain_irq+0xe0/0x140
> > gic_handle_irq+0x6c/0x190
> > call_on_irq_stack+0x24/0x30
> > do_interrupt_handler+0xd4/0x138
> > el1_interrupt+0x34/0x54
> > el1h_64_irq_handler+0x18/0x24
> > el1h_64_irq+0x6c/0x70
> > generic_exec_single+0x2dc/0x304 (P)
> > smp_call_function_single+0x308/0x530
> > perf_install_in_context+0x4a0/0x798
> > __arm64_sys_perf_event_open+0x184c/0x1d50
> > invoke_syscall+0x98/0x2b8
> > el0_svc_common+0x130/0x23c
> > do_el0_svc+0x48/0x58
> > el0_svc+0x58/0x17c
> > el0t_64_sync_handler+0x78/0x108
> > el0t_64_sync+0x198/0x19c
> > ---[ end trace ]---
> > ------------[ cut here ]------------
> > UBSAN: shift-out-of-bounds in drivers/perf/arm_pmuv3.c:658:13
> > shift exponent -1 is negative
> > CPU: 0 UID: 0 PID: 8006 Comm: syz.0.19 Not tainted 6.16.0-rc2-g5982a539cdce #3 PREEMPT
> > Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
> > Call trace:
> > show_stack+0x2c/0x3c (C)
> > __dump_stack+0x30/0x40
> > dump_stack_lvl+0xd8/0x12c
> > dump_stack+0x1c/0x28
> > ubsan_epilogue+0x14/0x48
> > __ubsan_handle_shift_out_of_bounds+0x2b0/0x34c
> > armv8pmu_disable_event+0x228/0x2f8
> > armpmu_stop+0xa0/0x104
> > perf_event_throttle_group+0x354/0x4cc
> > __perf_event_account_interrupt+0x26c/0x290
> > __perf_event_overflow+0xe8/0xd28
> > perf_event_overflow+0x38/0x4c
> > armv8pmu_handle_irq+0x244/0x320
> > armpmu_dispatch_irq+0x6c/0x9c
> > handle_percpu_devid_irq+0x174/0x308
> > generic_handle_domain_irq+0xe0/0x140
> > gic_handle_irq+0x6c/0x190
> > call_on_irq_stack+0x24/0x30
> > do_interrupt_handler+0xd4/0x138
> > el1_interrupt+0x34/0x54
> > el1h_64_irq_handler+0x18/0x24
> > el1h_64_irq+0x6c/0x70
>
> This indicates a regular IRQ rather than a pNMI.
>
> > generic_exec_single+0x2dc/0x304 (P)
> > smp_call_function_single+0x308/0x530
> > perf_install_in_context+0x4a0/0x798
> > __arm64_sys_perf_event_open+0x184c/0x1d50
> > invoke_syscall+0x98/0x2b8
> > el0_svc_common+0x130/0x23c
> > do_el0_svc+0x48/0x58
> > el0_svc+0x58/0x17c
> > el0t_64_sync_handler+0x78/0x108
> > el0t_64_sync+0x198/0x19c
> > ---[ end trace ]---
> > ------------[ cut here ]------------
> > UBSAN: shift-out-of-bounds in drivers/perf/arm_pmuv3.c:730:26
> > shift exponent -1 is negative
> > CPU: 0 UID: 0 PID: 8006 Comm: syz.0.19 Not tainted 6.16.0-rc2-g5982a539cdce #3 PREEMPT
> > Hardware name: QEMU KVM Virtual Machine, BIOS 2025.02-8 05/13/2025
> > Call trace:
> > show_stack+0x2c/0x3c (C)
> > __dump_stack+0x30/0x40
> > dump_stack_lvl+0xd8/0x12c
> > dump_stack+0x1c/0x28
> > ubsan_epilogue+0x14/0x48
> > __ubsan_handle_shift_out_of_bounds+0x2b0/0x34c
> > armv8pmu_disable_event+0x298/0x2f8
> > armpmu_stop+0xa0/0x104
> > perf_event_throttle_group+0x354/0x4cc
> > __perf_event_account_interrupt+0x26c/0x290
> > __perf_event_overflow+0xe8/0xd28
> > perf_event_overflow+0x38/0x4c
> > armv8pmu_handle_irq+0x244/0x320
> > armpmu_dispatch_irq+0x6c/0x9c
> > handle_percpu_devid_irq+0x174/0x308
> > generic_handle_domain_irq+0xe0/0x140
> > gic_handle_irq+0x6c/0x190
> > call_on_irq_stack+0x24/0x30
> > do_interrupt_handler+0xd4/0x138
> > el1_interrupt+0x34/0x54
> > el1h_64_irq_handler+0x18/0x24
> > el1h_64_irq+0x6c/0x70
> > generic_exec_single+0x2dc/0x304 (P)
> > smp_call_function_single+0x308/0x530
> > perf_install_in_context+0x4a0/0x798
> > __arm64_sys_perf_event_open+0x184c/0x1d50
> > invoke_syscall+0x98/0x2b8
> > el0_svc_common+0x130/0x23c
> > do_el0_svc+0x48/0x58
> > el0_svc+0x58/0x17c
> > el0t_64_sync_handler+0x78/0x108
> > el0t_64_sync+0x198/0x19c
> > ---[ end trace ]---
> >
> > Fixes: 29227d6ea157 ("arm64: perf: Clean up enable/disable calls")
>
> As above, I do not believe that this fixes tag is accurate.
>
> > Signed-off-by: Yunseong Kim <ysk@...lloc.com>
> > Tested-by: Yunseong Kim <ysk@...lloc.com>
> > Cc: Yeoreum Yun <yeoreum.yun@....com>
> > Cc: syzkaller@...glegroups.com
> > ---
> > drivers/perf/arm_pmuv3.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/perf/arm_pmuv3.c b/drivers/perf/arm_pmuv3.c
> > index 3db9f4ed17e8..846d69643fd8 100644
> > --- a/drivers/perf/arm_pmuv3.c
> > +++ b/drivers/perf/arm_pmuv3.c
> > @@ -795,6 +795,9 @@ static void armv8pmu_enable_user_access(struct arm_pmu *cpu_pmu)
> >
> > static void armv8pmu_enable_event(struct perf_event *event)
> > {
> > + if (unlikely(event->hw.idx < 0))
> > + return;
> > +
> > armv8pmu_write_event_type(event);
> > armv8pmu_enable_event_irq(event);
> > armv8pmu_enable_event_counter(event);
> > @@ -802,6 +805,9 @@ static void armv8pmu_enable_event(struct perf_event *event)
> >
> > static void armv8pmu_disable_event(struct perf_event *event)
> > {
> > + if (unlikely(event->hw.idx < 0))
> > + return;
> > +
> > armv8pmu_disable_event_counter(event);
> > armv8pmu_disable_event_irq(event);
>
> As above, this is not a correct fix, and NAK to silently ignoring an
> invalid idx.
>
> Mark.
>