[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18f20d38-070c-4e17-bc90-cf7102e1e53d@linux.intel.com>
Date: Fri, 17 Oct 2025 15:25:14 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, James Clark <james.clark@...aro.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Yoshihiro Furudera <fj5100bi@...itsu.com>, Howard Chu
<howardchu95@...il.com>, Thomas Falcon <thomas.falcon@...el.com>,
Andi Kleen <ak@...ux.intel.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: Xudong Hao <xudong.hao@...el.com>
Subject: Re: [PATCH v1 3/3] perf parse-events: Add 'X' modifier to exclude an
event from being regrouped
On 8/26/2025 5:12 AM, Ian Rogers wrote:
> The function parse_events__sort_events_and_fix_groups is needed to fix
> uncore events like:
> ```
> $ perf stat -e '{data_read,data_write}' ...
> ```
> so that the multiple uncore PMUs have a group each of data_read and
> data_write events.
>
> The same function will perform architecture sorting and group fixing,
> in particular for Intel topdown/perf-metric events. Grouping multiple
> perf metric events together causes perf_event_open to fail as the
> group can only support one. This means command lines like:
> ```
> $ perf stat -e 'slots,slots' ...
> ```
> fail as the slots events are forced into a group together to try to
> satisfy the perf-metric event constraints.
>
> As the user may know better than
> parse_events__sort_events_and_fix_groups add a 'X' modifier to skip
> its regrouping behavior. This allows the following to succeed rather
> than fail on the second slots event being opened:
> ```
> $ perf stat -e 'slots,slots:X' -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 6,834,154,071 cpu_core/slots/ (50.13%)
> 5,548,629,453 cpu_core/slots/X (49.87%)
>
> 1.002634606 seconds time elapsed
> ```
>
> Reported-by: Xudong Hao <xudong.hao@...el.com>
> Reported-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
> Closes: https://lore.kernel.org/lkml/20250822082233.1850417-1-dapeng1.mi@linux.intel.com/
> Signed-off-by: Ian Rogers <irogers@...gle.com>
> ---
> tools/perf/Documentation/perf-list.txt | 1 +
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/parse-events.c | 5 +++--
> tools/perf/util/parse-events.h | 1 +
> tools/perf/util/parse-events.l | 5 +++--
> 5 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
> index 28215306a78a..a5039d1614f9 100644
> --- a/tools/perf/Documentation/perf-list.txt
> +++ b/tools/perf/Documentation/perf-list.txt
> @@ -73,6 +73,7 @@ counted. The following modifiers exist:
> e - group or event are exclusive and do not share the PMU
> b - use BPF aggregration (see perf stat --bpf-counters)
> R - retire latency value of the event
> + X - don't regroup the event to match PMUs
>
> The 'p' modifier can be used for specifying how precise the instruction
> address should be. The 'p' modifier can be specified multiple times:
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index e927a3a4fe0e..03f9f22e3a0c 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -89,6 +89,7 @@ struct evsel {
> bool use_config_name;
> bool skippable;
> bool retire_lat;
> + bool dont_regroup;
> int bpf_fd;
> struct bpf_object *bpf_obj;
> struct list_head config_terms;
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 8282ddf68b98..43de19551c81 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1892,6 +1892,8 @@ static int parse_events__modifier_list(struct parse_events_state *parse_state,
> evsel->bpf_counter = true;
> if (mod.retire_lat)
> evsel->retire_lat = true;
> + if (mod.dont_regroup)
> + evsel->dont_regroup = true;
> }
> return 0;
> }
> @@ -2188,13 +2190,12 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
> * Set the group leader respecting the given groupings and that
> * groups can't span PMUs.
> */
> - if (!cur_leader) {
> + if (!cur_leader || pos->dont_regroup) {
> cur_leader = pos;
> cur_leaders_grp = &pos->core;
> if (pos_force_grouped)
> force_grouped_leader = pos;
> }
> -
> cur_leader_pmu_name = cur_leader->group_pmu_name;
> if (strcmp(cur_leader_pmu_name, pos_pmu_name)) {
> /* PMU changed so the group/leader must change. */
> diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
> index 62dc7202e3ba..a5c5fc39fd6f 100644
> --- a/tools/perf/util/parse-events.h
> +++ b/tools/perf/util/parse-events.h
> @@ -216,6 +216,7 @@ struct parse_events_modifier {
> bool guest : 1; /* 'G' */
> bool host : 1; /* 'H' */
> bool retire_lat : 1; /* 'R' */
> + bool dont_regroup : 1; /* 'X' */
> };
>
> int parse_events__modifier_event(struct parse_events_state *parse_state, void *loc,
> diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
> index 2034590eb789..294e943bcdb4 100644
> --- a/tools/perf/util/parse-events.l
> +++ b/tools/perf/util/parse-events.l
> @@ -206,6 +206,7 @@ static int modifiers(struct parse_events_state *parse_state, yyscan_t scanner)
> CASE('e', exclusive);
> CASE('b', bpf);
> CASE('R', retire_lat);
> + CASE('X', dont_regroup);
> default:
> return PE_ERROR;
> }
> @@ -251,10 +252,10 @@ term_name {name_start}[a-zA-Z0-9_*?.\[\]!\-:]*
> quoted_name [\']{name_start}[a-zA-Z0-9_*?.\[\]!\-:,\.=]*[\']
> drv_cfg_term [a-zA-Z0-9_\.]+(=[a-zA-Z0-9_*?\.:]+)?
> /*
> - * If you add a modifier you need to update check_modifier().
> + * If you add a modifier you need to update modifiers().
> * Also, the letters in modifier_event must not be in modifier_bp.
> */
> -modifier_event [ukhpPGHSDIWebR]{1,16}
> +modifier_event [ukhpPGHSDIWebRX]{1,17}
> modifier_bp [rwx]{1,3}
> lc_type (L1-dcache|l1-d|l1d|L1-data|L1-icache|l1-i|l1i|L1-instruction|LLC|L2|dTLB|d-tlb|Data-TLB|iTLB|i-tlb|Instruction-TLB|branch|branches|bpu|btb|bpc|node)
> lc_op_result (load|loads|read|store|stores|write|prefetch|prefetches|speculative-read|speculative-load|refs|Re
Hi Ian,
It looks the "X" modifier only works for the cases without explicit group,
like this.
sudo ./perf record -e cpu/mem-stores/ppu,cpu/slots/uX -- sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
Once there is an explicit group, the "X" modifier would not work and the
regroup still happens, e.g.,
sudo ./perf record -e '{cpu/mem-stores/ppu,cpu/slots/uX}' -- sleep 1
WARNING: events were regrouped to match PMUs
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
I suppose we should enhance the "X" modifier and make it work in 2nd case
as well. How's your idea?
Thanks,
Dapeng Mi
> ference|ops|access|misses|miss)
Powered by blists - more mailing lists