lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fXF2x3hW4Sk+A362T-50cBbw6HVd7KY+QEUjFwT+JL37Q@mail.gmail.com>
Date: Tue, 26 Aug 2025 11:48:48 -0700
From: Ian Rogers <irogers@...gle.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Peter Zijlstra <peterz@...radead.org>, mingo@...hat.com, will@...nel.org, 
	mark.rutland@....com, acme@...nel.org, namhyung@...nel.org, 
	alexander.shishkin@...ux.intel.com, jolsa@...nel.org, adrian.hunter@...el.com, 
	kan.liang@...ux.intel.com, linux-perf-users@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-alpha@...r.kernel.org, 
	linux-snps-arc@...ts.infradead.org, linux-arm-kernel@...ts.infradead.org, 
	imx@...ts.linux.dev, linux-csky@...r.kernel.org, loongarch@...ts.linux.dev, 
	linux-mips@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org, 
	linux-s390@...r.kernel.org, linux-sh@...r.kernel.org, 
	sparclinux@...r.kernel.org, linux-pm@...r.kernel.org, 
	linux-rockchip@...ts.infradead.org, dmaengine@...r.kernel.org, 
	linux-fpga@...r.kernel.org, amd-gfx@...ts.freedesktop.org, 
	dri-devel@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org, 
	intel-xe@...ts.freedesktop.org, coresight@...ts.linaro.org, 
	iommu@...ts.linux.dev, linux-amlogic@...ts.infradead.org, 
	linux-cxl@...r.kernel.org, linux-arm-msm@...r.kernel.org, 
	linux-riscv@...ts.infradead.org
Subject: Re: [PATCH 12/19] perf: Ignore event state for group validation

On Tue, Aug 26, 2025 at 8:32 AM Robin Murphy <robin.murphy@....com> wrote:
>
> On 2025-08-26 2:03 pm, Peter Zijlstra wrote:
> > On Wed, Aug 13, 2025 at 06:01:04PM +0100, Robin Murphy wrote:
> >> It may have been different long ago, but today it seems wrong for these
> >> drivers to skip counting disabled sibling events in group validation,
> >> given that perf_event_enable() could make them schedulable again, and
> >> thus increase the effective size of the group later. Conversely, if a
> >> sibling event is truly dead then it stands to reason that the whole
> >> group is dead, so it's not worth going to any special effort to try to
> >> squeeze in a new event that's never going to run anyway. Thus, we can
> >> simply remove all these checks.
> >
> > So currently you can do sort of a manual event rotation inside an
> > over-sized group and have it work.
> >
> > I'm not sure if anybody actually does this, but its possible.
> >
> > Eg. on a PMU that supports only 4 counters, create a group of 5 and
> > periodically cycle which of the 5 events is off.

I'm not sure this is true, I thought this would fail in the
perf_event_open when adding the 5th event and there being insufficient
counters for the group. Not all PMUs validate a group will fit on the
counters, but I thought at least Intel's core PMU would validate and
not allow this. Fwiw, the metric code is reliant on this behavior as
by default all events are placed into a weak group:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/metricgroup.c?h=perf-tools-next#n631
Weak groups are really just groups that when the perf_event_open fails
retry with the grouping removed. PMUs that don't fail the
perf_event_open are problematic as the reads just report "not counted"
and the metric doesn't work. Sometimes the PMU can't help it due to
errata. There are a bunch of workarounds for those cases carried in
the perf tool, but in general weak groups working is relied upon:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/pmu-events.h?h=perf-tools-next#n16

Thanks,
Ian

> > So I'm not against changing this, but changing stuff like this always
> > makes me a little fearful -- it wouldn't be the first time that when it
> > finally trickles down to some 'enterprise' user in 5 years someone comes
> > and finally says, oh hey, you broke my shit :-(
>
> Eww, I see what you mean... and I guess that's probably lower-overhead
> than actually deleting and recreating the sibling event(s) each time,
> and potentially less bother then wrangling multiple groups for different
> combinations of subsets when one simply must still approximate a complex
> metric that requires more counters than the hardware offers.
>
> I'm also not keen to break anything that wasn't already somewhat broken,
> especially since this patch is only intended as cleanup, so either we
> could just drop it altogether, or perhaps I can wrap the existing
> behaviour in a helper that can at least document this assumption and
> discourage new drivers from copying it. Am I right that only
> PERF_EVENT_STATE_{OFF,ERROR} would matter for this, though, and my
> reasoning for state <= PERF_EVENT_STATE_EXIT should still stand? As for
> the fiddly discrepancy with enable_on_exec between arm_pmu and others
> I'm not really sure what to think...
>
> Thanks,
> Robin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ