Message-ID: <922f69f6-e290-46f6-af6f-5a71e4508cf0@linux.intel.com>
Date: Tue, 12 Aug 2025 16:51:28 -0700
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Yunseong Kim <ysk@...lloc.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>, Will Deacon <will@...nel.org>,
Yeoreum Yun <yeoreum.yun@....com>, Austin Kim <austindh.kim@...il.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
syzkaller@...glegroups.com
Subject: Re: [PATCH v3] perf: Avoid undefined behavior from stopping/starting
inactive events
On 2025-08-12 11:10 a.m., Yunseong Kim wrote:
> Calling pmu->start()/stop() on perf events in PERF_EVENT_STATE_OFF can
> leave event->hw.idx at -1. When PMU drivers later attempt to use this
> negative index as a shift exponent in bitwise operations, it leads to UBSAN
> shift-out-of-bounds reports.
>
> The issue is a logical flaw in how event groups handle throttling when some
> members are intentionally disabled. Based on the analysis and the
> reproducer provided by Mark Rutland, the issue reproduces on both arm64
> and x86-64.
>
> The scenario unfolds as follows:
>
> 1. A group leader event is configured with a very aggressive sampling
> period (e.g., sample_period = 1). This causes frequent interrupts and
> triggers the throttling mechanism.
> 2. A child event in the same group is created in a disabled state
> (.disabled = 1). This event remains in PERF_EVENT_STATE_OFF.
> Since it hasn't been scheduled onto the PMU, its event->hw.idx remains
> initialized at -1.
> 3. When throttling occurs, perf_event_throttle_group() and later
> perf_event_unthrottle_group() iterate through all siblings, including
> the disabled child event.
> 4. perf_event_throttle()/unthrottle() are called on this inactive child
> event, which then call event->pmu->start()/stop().
> 5. The PMU driver receives the event with hw.idx == -1 and attempts to
> use it as a shift exponent. e.g., in macros like PMCNTENSET(idx),
> leading to the UBSAN report.
>
> The throttling mechanism attempts to start/stop events that are not
> actively scheduled on the hardware.
>
> Move the state check into perf_event_throttle()/perf_event_unthrottle() so
> that inactive events are skipped entirely. This ensures only active events
> with a valid hw.idx are processed, preventing undefined behavior and
> silencing UBSAN warnings. The added check verifies that the event is in
> PERF_EVENT_STATE_ACTIVE before proceeding with PMU operations.
>
> The problem can be reproduced with the syzkaller reproducer:
> Link: https://lore.kernel.org/lkml/714b7ba2-693e-42e4-bce4-feef2a5e7613@kzalloc.com/
>
> Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
> Cc: Mark Rutland <mark.rutland@....com>
> Signed-off-by: Yunseong Kim <ysk@...lloc.com>
Thanks for the fix.
Reviewed-by: Kan Liang <kan.liang@...ux.intel.com>
Thanks,
Kan
> ---
> kernel/events/core.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8060c2857bb2..872122e074e5 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2665,6 +2665,9 @@ static void perf_log_itrace_start(struct perf_event *event);
>
> static void perf_event_unthrottle(struct perf_event *event, bool start)
> {
> + if (event->state != PERF_EVENT_STATE_ACTIVE)
> + return;
> +
> event->hw.interrupts = 0;
> if (start)
> event->pmu->start(event, 0);
> @@ -2674,6 +2677,9 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>
> static void perf_event_throttle(struct perf_event *event)
> {
> + if (event->state != PERF_EVENT_STATE_ACTIVE)
> + return;
> +
> event->hw.interrupts = MAX_INTERRUPTS;
> event->pmu->stop(event, 0);
> if (event == event->group_leader)