Message-ID: <cf7ae59b735f004b5c6dd53b82e3d3e2acfad973.camel@intel.com>
Date: Thu, 2 Oct 2025 15:38:29 +0000
From: "Falcon, Thomas" <thomas.falcon@...el.com>
To: "alexander.shishkin@...ux.intel.com" <alexander.shishkin@...ux.intel.com>,
"peterz@...radead.org" <peterz@...radead.org>, "acme@...nel.org"
<acme@...nel.org>, "dapeng1.mi@...ux.intel.com" <dapeng1.mi@...ux.intel.com>,
"mingo@...hat.com" <mingo@...hat.com>, "Hunter, Adrian"
<adrian.hunter@...el.com>, "namhyung@...nel.org" <namhyung@...nel.org>,
"jolsa@...nel.org" <jolsa@...nel.org>, "kan.liang@...ux.intel.com"
<kan.liang@...ux.intel.com>, "irogers@...gle.com" <irogers@...gle.com>,
"mark.rutland@....com" <mark.rutland@....com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-perf-users@...r.kernel.org" <linux-perf-users@...r.kernel.org>,
"ak@...ux.intel.com" <ak@...ux.intel.com>
Subject: Re: [RESEND][PATCH v2 0/2] perf record: ratio-to-prev event term for
auto counter reload
On Tue, 2025-09-30 at 15:28 +0800, Mi, Dapeng wrote:
>
> On 9/3/2025 12:40 AM, Thomas Falcon wrote:
> > The Auto Counter Reload (ACR)[1] feature is used to track the
> > relative rates of two or more perf events, only sampling
> > when a given threshold is exceeded. This helps reduce overhead
> > and unnecessary samples. However, enabling this feature
> > currently requires setting two parameters:
> >
> > -- Event sampling period ("period")
> > -- acr_mask, which determines which events get reloaded
> > when the sample period is reached.
> >
> > For example, in the following command:
> >
> > perf record -e "{cpu_atom/branch-misses,period=200000,\
> > acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\
> > acr_mask=0x3/u}" -- ./mispredict
> >
> > The goal is to limit event sampling to cases when the
> > branch miss rate exceeds 20%. If the branch instructions
> > sample period is exceeded first, both events are reloaded.
> > If branch misses exceed their threshold first, only the
> > second counter is reloaded, and a sample is taken.
> >
> > To simplify this, provide a new “ratio-to-prev” event term
> > that works alongside the period event option or -c option.
> > This would allow users to specify the desired relative rate
> > between events as a ratio, making configuration more intuitive.
> >
> > With this enhancement, the equivalent command would be:
> >
> > perf record -e "{cpu_atom/branch-misses/ppu,\
> > cpu_atom/branch-instructions,period=1000000,ratio-to-prev=5/u}" \
> > -- ./mispredict
>
> Hi Tom,
>
> Does this "ratio-to-prev" option support 3 and more events in ACR
> group?
>
Hi Dapeng,

The "ratio-to-prev" option currently supports only groups with two
events. For larger event groups, the "acr_mask" term is available.
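For illustration only, here is a rough sketch of how the two-event
shorthand lines up with the explicit terms in the cover letter example.
It is not the code from the patch: the function name is hypothetical,
integer division is an assumption, and the two acr_mask values are
simply copied from the explicit command quoted above.

```python
def expand_ratio_to_prev(period, ratio):
    """Hypothetical helper: expand period + ratio-to-prev for a two-event group.

    Returns the explicit terms for (previous event, this event).
    """
    if ratio <= 0:
        raise ValueError("ratio must be a positive integer")
    prev_period = period // ratio  # assumption: integer division
    if prev_period == 0:
        raise ValueError("ratio is larger than the period")
    # Mask values copied from the explicit example in the cover letter:
    # 0x2 on the first event, 0x3 on the second.
    prev_terms = {"period": prev_period, "acr_mask": 0x2}
    this_terms = {"period": period, "acr_mask": 0x3}
    return prev_terms, this_terms


if __name__ == "__main__":
    prev_terms, this_terms = expand_ratio_to_prev(1_000_000, 5)
    print("branch-misses:      ", prev_terms)
    print("branch-instructions:", this_terms)
```

With period=1000000 and ratio-to-prev=5 this gives a period of 200000
for branch-misses, matching the explicit command.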
> If not, should we consider supporting cases with 3 or more events in
> the ACR group? (If I remember correctly, the PMU driver should
> support it.)
>
Correct.
> e.g.,
>
> perf record -e "{cpu_atom/branch-misses,period=200000,acr_mask=0x6/p,\
> cpu_atom/branches,period=1000000,acr_mask=0x7/,\
> cpu_atom/branches,period=1000000,acr_mask=0x7/}" -- sleep 1
>
> Of course, this is just an example showing that such cases are
> supported; it doesn't mean the command is meaningful. But we can't
> rule out that users have such real requirements.
>
> If we want to support 3 or more events in an ACR group (if not
> already supported), we'd better rename the "ratio-to-prev" option to
> "ratio-to-head", allow only the group leader to set its sampling
> period explicitly with the "period" option, and calculate the
> sampling period of every other group member from the group leader's
> sampling period and "ratio-to-head", maybe like this:
>
> perf record -e "{cpu_atom/branch-misses,period=200000/p,\
> cpu_atom/branches,ratio-to-head=5/,\
> cpu_atom/branches,ratio-to-head=5/}" -- sleep 1
>
> Thanks.
>
>
Thanks, those are good suggestions, but the goal of this feature is to
give users a way to make simple comparisons with ACR without needing
the "acr_mask" field. For tests comparing larger event groups, the
"acr_mask" field can be used instead.
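As a toy model of the simple comparison this is meant for, the sketch
below follows the reload behavior described in the cover letter: if the
branch-instructions period is reached first, both counters restart; if
the branch-misses period is reached first, a sample is taken and the
second counter restarts. It is a simplified software simulation, not
how the hardware or the PMU driver is implemented, and the function
name and the small periods used in the demo are made up for the example.

```python
def count_samples(mispredicted_stream, miss_period, branch_period):
    """Toy model: count samples taken for a stream of branch outcomes.

    mispredicted_stream yields True for a mispredicted branch, False otherwise.
    """
    misses = 0
    branches = 0
    samples = 0
    for mispredicted in mispredicted_stream:
        branches += 1
        if mispredicted:
            misses += 1
        if misses >= miss_period:
            # Miss threshold reached first: take a sample; the sampling
            # event starts a new period and the branch counter is reloaded.
            samples += 1
            misses = 0
            branches = 0
        elif branches >= branch_period:
            # Branch period reached first: both counters are reloaded,
            # no sample is taken.
            misses = 0
            branches = 0
    return samples


if __name__ == "__main__":
    # Scaled-down periods with the same 1:5 ratio (a 20% threshold).
    low = (i % 10 == 0 for i in range(1000))   # 10% miss rate
    high = (i % 2 == 0 for i in range(1000))   # 50% miss rate
    print("10% miss rate:", count_samples(low, 2, 10), "samples")
    print("50% miss rate:", count_samples(high, 2, 10), "samples")
```

In this model the 10% stream never produces a sample, while the 50%
stream does, which is the effect the 200000/1000000 periods aim for.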
Thanks,
Tom
> >
> > or
> >
> > perf record -e "{cpu_atom/branch-misses/ppu,\
> > cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \
> > -- ./mispredict
> >
> > [1]
> > https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/
> >
> > Changes in v2 (mostly suggested by Ian Rogers):
> >
> > -- Add documentation explaining acr_mask bitmask used by ACR
> > -- Move ACR specific implementation to arch/x86/
> > -- Provide test cases for event parsing and perf record tests
> >
> > Thomas Falcon (2):
> > perf record: Add ratio-to-prev term
> > perf record: add auto counter reload parse and regression tests
> >
> > tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++
> > tools/perf/Documentation/perf-list.txt | 2 +
> > tools/perf/arch/x86/util/evsel.c | 53 ++++++++++++++++++
> > tools/perf/tests/parse-events.c | 54 ++++++++++++++++++
> > tools/perf/tests/shell/record.sh | 40 ++++++++++++++
> > tools/perf/util/evsel.c | 76 ++++++++++++++++++++++++++
> > tools/perf/util/evsel.h | 1 +
> > tools/perf/util/evsel_config.h | 1 +
> > tools/perf/util/parse-events.c | 22 ++++++++
> > tools/perf/util/parse-events.h | 3 +-
> > tools/perf/util/parse-events.l | 1 +
> > tools/perf/util/pmu.c | 3 +-
> > 12 files changed, 307 insertions(+), 2 deletions(-)
> > create mode 100644 tools/perf/Documentation/intel-acr.txt
> >