linux-kernel - Re: [PATCH] perf tools topdown: Fix incorrect topdown slots regroup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fW6N+eE0KTyV7F4wm=KBwk46QbXjwwG9POtZxEDhbRqRQ@mail.gmail.com>
Date: Fri, 22 Aug 2025 10:35:45 -0700
From: Ian Rogers <irogers@...gle.com>
To: Dapeng Mi <dapeng1.mi@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Dapeng Mi <dapeng1.mi@...el.com>, Xudong Hao <xudong.hao@...el.com>
Subject: Re: [PATCH] perf tools topdown: Fix incorrect topdown slots regroup

On Fri, Aug 22, 2025 at 1:23 AM Dapeng Mi <dapeng1.mi@...ux.intel.com> wrote:
>
> When running the command "perf stat -e "slots,slots" -C0 sleep 1", we
> see below error.
>
> perf stat -e "slots,slots" -C0 sleep 1
> WARNING: events were regrouped to match PMUs
>  Performance counter stats for 'CPU(s) 0':
>      <not counted>      slots
>    <not supported>      slots
>
>      1.002265769 seconds time elapsed
>
> The topdown slots events are not correctly counted. The root cause is
> that perf tools incorrectly regroup these 2 slots events into a group.
> If there are only topdown slots events but no topdown metrics events,
> the regroup should not be done since topdown slots event can only be
> programed on fixed counter 3 and multiple slots events can only be
> multiplexed to run on fixed counter 3, but grouping them blocks
> multiplexing.
>
> So avoid to regroup topdown slots events if there is no topdown metrics
> events.
>
> With this change, above command can be run successfully.
>
> perf stat -e "slots,slots" -C0 sleep 1
>  Performance counter stats for 'CPU(s) 0':
>        103,973,791      slots
>        106,488,170      slots
>
>        1.003517284 seconds time elapsed
>
> Besides, run perf stats/record test on SPR and PTL, both passed.
>
> Reported-by: Xudong Hao <xudong.hao@...el.com>
> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>

I don't think we should do this and if we were to do it we shouldn't
do it in the common code. The perf metrics requiring a slots event is
a massive mess that never seems to end. What should we do with:
```
$ perf stat -e "topdown-fe-bound,topdown-fe-bound" true

Performance counter stats for 'true':

    <not counted>      slots
    <not counted>      topdown-fe-bound
  <not supported>      topdown-fe-bound

      0.000960472 seconds time elapsed

      0.001060000 seconds user
      0.000000000 seconds sys


Some events weren't counted. Try disabling the NMI watchdog:
       echo 0 > /proc/sys/kernel/nmi_watchdog
       perf stat ...
       echo 1 > /proc/sys/kernel/nmi_watchdog
```

We've injected the slots event but the repeated topdown-fe-bound
causes the group to fail in a similar way. Why is repeating slots a
case we care about more?
Were we to say, okay slots is more special than the other perf metric
events then I'd prefer arch_evsel__must_be_in_group to return false
for the slots event when there are no other perf metric events in the
evlist. But then what do you do if the slots event is in a different
group like:
```
$ perf stat -e "slots,{slots,topdown-fe-bound}" true
```
It is pretty easy to teach the code to deduplicate events, but then
again, what about the group behavior?
It is not clear to me we can ever clean up all the mess that the perf
metric events on p-cores create and I'm in favor of having the code be
no more complex than it needs to be, so I'm not sure we should be
solving this problem.

Thanks,
Ian

> ---
>  tools/perf/arch/x86/util/topdown.h |  2 --
>  tools/perf/util/evsel.c            | 10 ++++++++++
>  tools/perf/util/evsel.h            |  2 ++
>  tools/perf/util/parse-events.c     | 11 +++++++++++
>  4 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/topdown.h b/tools/perf/arch/x86/util/topdown.h
> index 69035565e649..6a917b2066f7 100644
> --- a/tools/perf/arch/x86/util/topdown.h
> +++ b/tools/perf/arch/x86/util/topdown.h
> @@ -8,8 +8,6 @@ struct evsel;
>  struct list_head;
>
>  bool topdown_sys_has_perf_metrics(void);
> -bool arch_is_topdown_slots(const struct evsel *evsel);
> -bool arch_is_topdown_metrics(const struct evsel *evsel);
>  int topdown_insert_slots_event(struct list_head *list, int idx, struct evsel *metric_event);
>
>  #endif
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d264c143b592..6aaae1ac026e 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -3965,6 +3965,16 @@ int evsel__source_count(const struct evsel *evsel)
>         return count;
>  }
>
> +bool __weak arch_is_topdown_slots(const struct evsel *evsel __maybe_unused)
> +{
> +       return false;
> +}
> +
> +bool __weak arch_is_topdown_metrics(const struct evsel *evsel __maybe_unused)
> +{
> +       return false;
> +}
> +
>  bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused)
>  {
>         return false;
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 5797a02e5d6a..33f8aab675a9 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -556,6 +556,8 @@ void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
>  int evsel__source_count(const struct evsel *evsel);
>  void evsel__remove_from_group(struct evsel *evsel, struct evsel *leader);
>
> +bool arch_is_topdown_slots(const struct evsel *evsel);
> +bool arch_is_topdown_metrics(const struct evsel *evsel);
>  bool arch_evsel__must_be_in_group(const struct evsel *evsel);
>
>  bool evsel__set_needs_uniquify(struct evsel *counter, const struct perf_stat_config *config);
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 8282ddf68b98..bd09fc47ea90 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -2127,6 +2127,8 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>         int ret;
>         struct evsel *force_grouped_leader = NULL;
>         bool last_event_was_forced_leader = false;
> +       bool has_slots = false;
> +       bool has_metrics = false;
>
>         /* On x86 topdown metrics events require a slots event. */
>         ret = arch_evlist__add_required_events(list);
> @@ -2147,6 +2149,11 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>                 if (pos == pos_leader)
>                         orig_num_leaders++;
>
> +               if (!has_slots)
> +                       has_slots = arch_is_topdown_slots(pos);
> +               if (!has_metrics)
> +                       has_metrics = arch_is_topdown_metrics(pos);
> +
>                 /*
>                  * Ensure indexes are sequential, in particular for multiple
>                  * event lists being merged. The indexes are used to detect when
> @@ -2163,6 +2170,10 @@ static int parse_events__sort_events_and_fix_groups(struct list_head *list)
>                         force_grouped_idx = pos_leader->core.idx;
>         }
>
> +       /* Don't regroup if there are only topdown slots events. */
> +       if (force_grouped_idx != -1 && has_slots && !has_metrics)
> +               force_grouped_idx = -1;
> +
>         /* Sort events. */
>         list_sort(&force_grouped_idx, list, evlist__cmp);
>
>
> base-commit: 6235ce77749f45cac27f630337e2fdf04e8a6c73
> --
> 2.34.1
>