[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fU8LWyKrxn==ieO+dOCyvDML5Qnf23K=5bqHdHMvGx-eA@mail.gmail.com>
Date: Tue, 11 Mar 2025 08:14:49 -0700
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Robin Murphy <robin.murphy@....com>,
Leo Yan <leo.yan@....com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] perf list: Collapse similar events across PMUs
On Tue, Mar 11, 2025 at 7:13 AM James Clark <james.clark@...aro.org> wrote:
>
>
>
> On 07/03/2025 5:35 pm, Ian Rogers wrote:
> > On Fri, Mar 7, 2025 at 6:08 AM James Clark <james.clark@...aro.org> wrote:
> >>
> >>
> >>
> >> On 05/03/2025 9:40 pm, Ian Rogers wrote:
> >>> On Tue, Mar 4, 2025 at 5:50 AM James Clark <james.clark@...aro.org> wrote:
> >>>>
> >>>> Instead of showing multiple line items with the same event name and
> >>>> description, show a single line and concatenate all PMUs that this
> >>>> event can belong to.
> >>>>
> >>>> Don't do it for json output. Machine readable output doesn't need to be
> >>>> minimized, and changing the "Unit" field to a list type would break
> >>>> backwards compatibility.
> >>>>
> >>>> Before:
> >>>> $ perf list -v
> >>>> ...
> >>>> br_indirect_spec
> >>>> [Branch speculatively executed,indirect branch. Unit: armv8_cortex_a53]
> >>>> br_indirect_spec
> >>>> [Branch speculatively executed,indirect branch. Unit: armv8_cortex_a57]
> >>>>
> >>>> After:
> >>>>
> >>>> $ perf list -v
> >>>> ...
> >>>> br_indirect_spec
> >>>> [Branch speculatively executed,indirect branch. Unit: armv8_cortex_a53,armv8_cortex_a57]
> >>>>
> >>>> Signed-off-by: James Clark <james.clark@...aro.org>
> >>>> ---
> >>>> tools/perf/builtin-list.c | 2 ++
> >>>> tools/perf/util/pmus.c | 75 +++++++++++++++++++++++++++++++++++++-----
> >>>> tools/perf/util/print-events.h | 1 +
> >>>> 3 files changed, 70 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
> >>>> index fed482adb039..aacd7beae2a0 100644
> >>>> --- a/tools/perf/builtin-list.c
> >>>> +++ b/tools/perf/builtin-list.c
> >>>> @@ -516,6 +516,7 @@ int cmd_list(int argc, const char **argv)
> >>>> .print_event = default_print_event,
> >>>> .print_metric = default_print_metric,
> >>>> .skip_duplicate_pmus = default_skip_duplicate_pmus,
> >>>> + .collapse_events = true
> >>>
> >>> So collapse_events is put in the callbacks but isn't a callback. I
> >>> think skipping duplicates and collapsing are the same thing, I'd
> >>> prefer it if there weren't two terms for the same thing. If you prefer
> >>> collapsing as a name then default_skip_duplicate_pmus can be
> >>> default_collapse_pmus.
> >>>
> >>
> >> I can use the existing callback and rename it. That does have the effect
> >> of disabling this behavior in verbose mode which may be useful or may be
> >> unexpected to some people. But it seems pretty 50/50 so fewer callbacks
> >> are probably better.
> >>
> >>>> };
> >>>> const char *cputype = NULL;
> >>>> const char *unit_name = NULL;
> >>>> @@ -574,6 +575,7 @@ int cmd_list(int argc, const char **argv)
> >>>> .print_event = json_print_event,
> >>>> .print_metric = json_print_metric,
> >>>> .skip_duplicate_pmus = json_skip_duplicate_pmus,
> >>>> + .collapse_events = false
> >>>> };
> >>>> ps = &json_ps;
> >>>> } else {
> >>>> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> >>>> index 4d60bac2d2b9..cb1b14ade25b 100644
> >>>> --- a/tools/perf/util/pmus.c
> >>>> +++ b/tools/perf/util/pmus.c
> >>>> @@ -453,17 +453,50 @@ static int cmp_sevent(const void *a, const void *b)
> >>>> /* Order by PMU name. */
> >>>> if (as->pmu == bs->pmu)
> >>>> return 0;
> >>>> - return strcmp(as->pmu_name ?: "", bs->pmu_name ?: "");
> >>>> + ret = strcmp(as->pmu_name ?: "", bs->pmu_name ?: "");
> >>>> + if (ret)
> >>>> + return ret;
> >>>> +
> >>>> + /* Order by remaining displayed fields for purposes of deduplication later */
> >>>> + ret = strcmp(as->scale_unit ?: "", bs->scale_unit ?: "");
> >>>> + if (ret)
> >>>> + return ret;
> >>>> + ret = !!as->deprecated - !!bs->deprecated;
> >>>> + if (ret)
> >>>> + return ret;
> >>>> + ret = strcmp(as->desc ?: "", bs->desc ?: "");
> >>>> + if (ret)
> >>>> + return ret;
> >>>> + return strcmp(as->long_desc ?: "", bs->long_desc ?: "");
> >>>> }
> >>>>
> >>>> -static bool pmu_alias_is_duplicate(struct sevent *a, struct sevent *b)
> >>>> +enum dup_type {
> >>>> + UNIQUE,
> >>>> + DUPLICATE,
> >>>> + SAME_TEXT
> >>>> +};
> >>>> +
> >>>> +static enum dup_type pmu_alias_duplicate_type(struct sevent *a, struct sevent *b)
> >>>> {
> >>>> /* Different names -> never duplicates */
> >>>> if (strcmp(a->name ?: "//", b->name ?: "//"))
> >>>> - return false;
> >>>> + return UNIQUE;
> >>>> +
> >>>> + /* Duplicate PMU name and event name -> hide completely */
> >>>> + if (strcmp(a->pmu_name, b->pmu_name) == 0)
> >>>> + return DUPLICATE;
> >>>> +
> >>>> + /* Any other different display text -> not duplicate */
> >>>> + if (strcmp(a->topic ?: "", b->topic ?: "") ||
> >>>> + strcmp(a->scale_unit ?: "", b->scale_unit ?: "") ||
> >>>> + a->deprecated != b->deprecated ||
> >>>> + strcmp(a->desc ?: "", b->desc ?: "") ||
> >>>> + strcmp(a->long_desc ?: "", b->long_desc ?: "")) {
> >>>> + return UNIQUE;
> >>>> + }
> >>>>
> >>>> - /* Don't remove duplicates for different PMUs */
> >>>> - return strcmp(a->pmu_name, b->pmu_name) == 0;
> >>>> + /* Same display text but different PMU -> collapse */
> >>>> + return SAME_TEXT;
> >>>> }
> >>>>
> >>>> struct events_callback_state {
> >>>> @@ -501,6 +534,21 @@ static int perf_pmus__print_pmu_events__callback(void *vstate,
> >>>> return 0;
> >>>> }
> >>>>
> >>>> +static void concat_pmu_names(char *pmu_names, size_t size, const char *a, const char *b)
> >>>> +{
> >>>> + size_t len = strlen(pmu_names);
> >>>> + size_t added;
> >>>> +
> >>>> + if (len)
> >>>> + added = snprintf(pmu_names + len, size - len, ",%s", b);
> >>>> + else
> >>>> + added = snprintf(pmu_names, size, "%s,%s", a, b);
> >>>> +
> >>>> + /* Truncate with ... */
> >>>> + if (added > 0 && added + len >= size)
> >>>> + sprintf(pmu_names + size - 4, "...");
> >>>> +}
> >>>> +
> >>>> void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *print_state)
> >>>> {
> >>>> struct perf_pmu *pmu;
> >>>> @@ -510,6 +558,7 @@ void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *p
> >>>> struct events_callback_state state;
> >>>> bool skip_duplicate_pmus = print_cb->skip_duplicate_pmus(print_state);
> >>>> struct perf_pmu *(*scan_fn)(struct perf_pmu *);
> >>>> + char pmu_names[128] = {0};
> >>>>
> >>>> if (skip_duplicate_pmus)
> >>>> scan_fn = perf_pmus__scan_skip_duplicates;
> >>>> @@ -539,12 +588,21 @@ void perf_pmus__print_pmu_events(const struct print_callbacks *print_cb, void *p
> >>>> qsort(aliases, len, sizeof(struct sevent), cmp_sevent);
> >>>> for (int j = 0; j < len; j++) {
> >>>> /* Skip duplicates */
> >>>> - if (j < len - 1 && pmu_alias_is_duplicate(&aliases[j], &aliases[j + 1]))
> >>>> - goto free;
> >>>> + if (j < len - 1) {
> >>>> + enum dup_type dt = pmu_alias_duplicate_type(&aliases[j], &aliases[j + 1]);
> >>>> +
> >>>> + if (dt == DUPLICATE) {
> >>>> + goto free;
> >>>> + } else if (print_cb->collapse_events && dt == SAME_TEXT) {
> >>>> + concat_pmu_names(pmu_names, sizeof(pmu_names),
> >>>> + aliases[j].pmu_name, aliases[j+1].pmu_name);
> >>>> + goto free;
> >>>> + }
> >>>> + }
> >>>
> >>> I think a label called 'free' is a little unfortunate given the
> >>> function called free.
> >>> When I did things with sevent I was bringing over previous `perf list`
> >>> code mainly so that the perf list output before and after the changes
> >>> was identical. I wonder if this logic would be better placed in the
> >>> callback like default_print_event which already maintains state
> >>> (last_topic) to optionally print different things. This may better
> >>> encapsulate the behavior rather than the logic being in the PMU code.
> >>>
> >>
> >> Yeah I can have a look at putting it in one of the callbacks. But in the
> >> end builtin-list.c is the only user of perf_pmus__print_pmu_events()
> >> anyway so makes you wonder if the whole thing can be moved to
> >> builtin-list and open coded without the callbackyness.
> >
> > I wanted to hide some of the innards of pmus, so I think that's the
> > reason it is as it is. The whole `sevent` was pre-existing and
> > maintained so that the output was the same.
> >
>
> Looking at this a bit more it is possible to move all of
> perf_pmus__print_pmu_events() (including its dependencies and perf list
> specifics) out of pmus.c into print-events.c without exposing any new
> innards of pmus. Only struct pmu_event_info would be used, but that's
> not private and is already used elsewhere.
>
> It's difficult to do this change only in the print_event callback
> because it relies on having the _next_ event, not the previous event.
> We're already tracking last_topic, and if we start passing all the
> next_* items too it gets a bit unmanageable.
>
> If it's all moved then doing display specific processing across the
> whole list becomes quite easy.
I'm not sure I follow all of this. If things can be moved to a more
obvious location, printing code in print-events.c, and we maintain
encapsulation then that sounds great to me. I'm not clear on the next
event issue. My hope with the print routines in builtin-list.c was
that anyone could come along and add another for whatever rendering
took their fancy. I didn't want it to be like the logic in
stat-display.c, where there are multiple levels of call backs, global
state, odd things like passing NULL meaning display column headers,
and the whole thing being a confusing rats nest where a change nearly
always ripples through and breaks something somewhere. Particularly
compare the json output in stat-display.c and builtin-list.c, my hope
was that builtin-list.c would be the model for reimplementing the
stat-display.c one. Others may have different opinions.
Thanks,
Ian
Powered by blists - more mailing lists