Message-ID: <20250815105605.GA3245006@noisy.programming.kicks-ass.net>
Date: Fri, 15 Aug 2025 12:56:05 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: Yunseong Kim <ysk@...lloc.com>, Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Will Deacon <will@...nel.org>, Yeoreum Yun <yeoreum.yun@....com>,
	Austin Kim <austindh.kim@...il.com>,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
	syzkaller@...glegroups.com
Subject: Re: [PATCH v3] perf: Avoid undefined behavior from stopping/starting
 inactive events

On Tue, Aug 12, 2025 at 04:51:28PM -0700, Liang, Kan wrote:
> 
> 
> On 2025-08-12 11:10 a.m., Yunseong Kim wrote:
> > Calling pmu->start()/stop() on perf events in PERF_EVENT_STATE_OFF can
> > leave event->hw.idx at -1. When PMU drivers later attempt to use this
> > negative index as a shift exponent in bitwise operations, it leads to UBSAN
> > shift-out-of-bounds reports.
> > 
> > The issue is a logical flaw in how event groups handle throttling when some
> > members are intentionally disabled. The analysis below is based on the
> > reproducer provided by Mark Rutland; the issue occurs on both arm64 and
> > x86-64.
> > 
> > The scenario unfolds as follows:
> > 
> >  1. A group leader event is configured with a very aggressive sampling
> >     period (e.g., sample_period = 1). This causes frequent interrupts and
> >     triggers the throttling mechanism.
> >  2. A child event in the same group is created in a disabled state
> >     (.disabled = 1). This event remains in PERF_EVENT_STATE_OFF.
> >     Since it hasn't been scheduled onto the PMU, its event->hw.idx remains
> >     initialized at -1.
> >  3. When throttling occurs, perf_event_throttle_group() and later
> >     perf_event_unthrottle_group() iterate through all siblings, including
> >     the disabled child event.
> >  4. perf_event_throttle()/unthrottle() are called on this inactive child
> >     event and in turn call event->pmu->stop()/start(), respectively.
> >  5. The PMU driver receives the event with hw.idx == -1 and attempts to
> >     use it as a shift exponent, e.g., in macros like PMCNTENSET(idx),
> >     leading to the UBSAN report.
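
The hazard in step 5 can be illustrated with a small userspace sketch. It is not driver code: `counter_bit()`, `counter_bit_checked()` and `NUM_COUNTERS` are hypothetical names invented here, standing in for a macro such as PMCNTENSET(idx). The point is only that a shift count of -1 is undefined in C, so the index has to be validated before it reaches the shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_COUNTERS 32	/* hypothetical counter count for this sketch */

/* Hypothetical stand-in for a driver macro such as PMCNTENSET(idx). */
static inline uint32_t counter_bit(int idx)
{
	/* A negative or too-large shift count is undefined behavior in C. */
	assert(idx >= 0 && idx < NUM_COUNTERS);
	return UINT32_C(1) << idx;
}

/* Returns true and stores the mask only when idx names a real counter. */
static inline bool counter_bit_checked(int idx, uint32_t *mask)
{
	if (idx < 0 || idx >= NUM_COUNTERS)
		return false;	/* e.g. idx == -1: event never scheduled */
	*mask = counter_bit(idx);
	return true;
}
```

With this sketch, `counter_bit_checked(-1, &mask)` returns false and leaves `mask` untouched instead of evaluating `1 << -1`.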
> > 
> > The throttling mechanism attempts to start/stop events that are not
> > actively scheduled on the hardware.
> > 
> > Move the state check into perf_event_throttle()/perf_event_unthrottle() so
> > that inactive events are skipped entirely. This ensures only active events
> > with a valid hw.idx are processed, preventing undefined behavior and
> > silencing UBSAN warnings. The corrected check ensures that the event is in
> > PERF_EVENT_STATE_ACTIVE before proceeding with PMU operations.
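
The shape of the fix described above can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the patch itself: every `mock_*` and `MOCK_*` name is hypothetical, and the struct carries only the fields needed to show that a state check inside the throttle/unthrottle helpers keeps PMU callbacks away from events that were never scheduled.

```c
#include <assert.h>

/* Simplified stand-ins; the real kernel structures carry far more state. */
enum mock_state {
	MOCK_STATE_OFF = -1,
	MOCK_STATE_INACTIVE = 0,
	MOCK_STATE_ACTIVE = 1,
};

struct mock_event {
	enum mock_state state;
	int hw_idx;	/* -1 until the event is scheduled on the PMU */
	int starts;	/* number of mock PMU start calls */
	int stops;	/* number of mock PMU stop calls */
};

/* Mock PMU callbacks: both assume a valid hardware index. */
static void mock_pmu_stop(struct mock_event *e)
{
	assert(e->hw_idx >= 0);
	e->stops++;
}

static void mock_pmu_start(struct mock_event *e)
{
	assert(e->hw_idx >= 0);
	e->starts++;
}

/* The state check lives in the helpers, so every caller is covered. */
static void mock_throttle(struct mock_event *e)
{
	if (e->state != MOCK_STATE_ACTIVE)
		return;		/* skip events that are not on the hardware */
	mock_pmu_stop(e);
}

static void mock_unthrottle(struct mock_event *e)
{
	if (e->state != MOCK_STATE_ACTIVE)
		return;
	mock_pmu_start(e);
}

/* Group helpers walk every member, including disabled ones. */
static void mock_throttle_group(struct mock_event *ev, int n)
{
	for (int i = 0; i < n; i++)
		mock_throttle(&ev[i]);
}

static void mock_unthrottle_group(struct mock_event *ev, int n)
{
	for (int i = 0; i < n; i++)
		mock_unthrottle(&ev[i]);
}
```

In this model, a group holding one active event and one event in `MOCK_STATE_OFF` with `hw_idx == -1` can be throttled and unthrottled without the mock PMU callbacks ever seeing the disabled member.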
> > 
> > The problem can be reproduced with the syzkaller reproducer:
> > Link: https://lore.kernel.org/lkml/714b7ba2-693e-42e4-bce4-feef2a5e7613@kzalloc.com/
> > 
> > Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
> > Cc: Mark Rutland <mark.rutland@....com>
> > Signed-off-by: Yunseong Kim <ysk@...lloc.com>
> 
> Thanks for the fix.
> 
> Reviewed-by: Kan Liang <kan.liang@...ux.intel.com>

Thanks both!
