[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUFzr3PymeSaaO=c-OnvhozoORFmeXmwWdqFhbVqknT3Q@mail.gmail.com>
Date: Wed, 15 May 2024 10:48:41 -0700
From: Ian Rogers <irogers@...gle.com>
To: John Garry <john.g.garry@...cle.com>
Cc: Jia He <justin.he@....com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
James Clark <james.clark@....com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] perf pmu: Count sys and cpuid json events separately
On Wed, May 15, 2024 at 9:09 AM John Garry <john.g.garry@...cle.com> wrote:
>
> On 10/05/2024 18:36, Ian Rogers wrote:
> > Sys events are eagerly loaded as each event has a compat option that
> > may mean the event is or isn't associated with the PMU. These
> > shouldn't be counted as loaded_json_events as that is used for json
> > events matching the CPUID that may or may not have been loaded. The
> > mismatch causes issues on ARM64 that uses sys events.
> >
> > Reported-by: Jia He <justin.he@....com>
> > Closes: https://urldefense.com/v3/__https://lore.kernel.org/lkml/20240510024729.1075732-1-justin.he@arm.com/__;!!ACWV5N9M2RV99hQ!KVhaRPLqS7T1s6p506Gv0zFNdlTR1exCUrXM3UIDr0EdrRFwrzM3W9ntkxw42jNeG01P7H78r0c-nz8FVQQ$
> > Fixes: e6ff1eed3584 ("perf pmu: Lazily add JSON events")
> > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > ---
> > tools/perf/util/pmu.c | 70 ++++++++++++++++++++++++++++++-------------
> > tools/perf/util/pmu.h | 6 ++--
> > 2 files changed, 53 insertions(+), 23 deletions(-)
> >
> > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> > index b3b072feef02..888ce9912275 100644
> > --- a/tools/perf/util/pmu.c
> > +++ b/tools/perf/util/pmu.c
> > @@ -36,6 +36,18 @@ struct perf_pmu perf_pmu__fake = {
> >
> > #define UNIT_MAX_LEN 31 /* max length for event unit name */
> >
> > +enum event_source {
> > + /* An event loaded from /sys/devices/<pmu>/events. */
> > + EVENT_SRC_SYSFS,
> > + /* An event loaded from a CPUID matched json file. */
> > + EVENT_SRC_CPU_JSON,
> > + /*
> > + * An event loaded from a /sys/devices/<pmu>/identifier matched json
> > + * file.
> > + */
> > + EVENT_SRC_SYS_JSON,
>
> If a json sys event aliases to a EVENT_SRC_SYSFS event, this is
> considered a EVENT_SRC_SYS_JSON event source, right?
Thanks John. The current use for this enum is just in
perf_pmu__new_alias to say which source is requesting the new
alias/event for the lazy loading book keeping. It isn't held in the
perf_pmu_alias as events may appear in both sysfs and json and we
update in that case.
> > +};
> > +
> > /**
> > * struct perf_pmu_alias - An event either read from sysfs or builtin in
> > * pmu-events.c, created by parsing the pmu-events json files.
> > @@ -521,7 +533,7 @@ static int update_alias(const struct pmu_event *pe,
> >
> > static int perf_pmu__new_alias(struct perf_pmu *pmu, const char *name,
> > const char *desc, const char *val, FILE *val_fd,
> > - const struct pmu_event *pe)
> > + const struct pmu_event *pe, enum event_source src)
> > {
> > struct perf_pmu_alias *alias;
> > int ret;
> > @@ -574,25 +586,30 @@ static int perf_pmu__new_alias(struct perf_pmu *pmu, const char *name,
> > }
> > snprintf(alias->unit, sizeof(alias->unit), "%s", unit);
> > }
> > - if (!pe) {
> > - /* Update an event from sysfs with json data. */
> > - struct update_alias_data data = {
> > - .pmu = pmu,
> > - .alias = alias,
> > - };
> > -
> > + switch (src) {
> > + default:
> > + case EVENT_SRC_SYSFS:
> > alias->from_sysfs = true;
> > if (pmu->events_table) {
> > + /* Update an event from sysfs with json data. */
> > + struct update_alias_data data = {
> > + .pmu = pmu,
> > + .alias = alias,
> > + };
> > if (pmu_events_table__find_event(pmu->events_table, pmu, name,
> > update_alias, &data) == 0)
> > - pmu->loaded_json_aliases++;
> > + pmu->cpu_json_aliases++;
>
> seems strange that matching for EVENT_SRC_SYSFS increases
> pmu->cpu_json_aliases
Yep. There's a comment at the start of the "if", we're trying to
update a srcfs event with json data as the pmu->events_table contained
the event. As the json event won't be used, it just updates the srcfs
event, we need to account for that when giving the number of
events/aliases.
> > }
> > - }
> > -
> > - if (!pe)
> > pmu->sysfs_aliases++;
> > - else
> > - pmu->loaded_json_aliases++;
> > + break;
> > + case EVENT_SRC_CPU_JSON:
> > + pmu->cpu_json_aliases++;
> > + break;
> > + case EVENT_SRC_SYS_JSON:
> > + pmu->sys_json_aliases++;
> > + break;
> > +
> > + }
> > list_add_tail(&alias->list, &pmu->aliases);
> > return 0;
> > }
> > @@ -653,7 +670,8 @@ static int __pmu_aliases_parse(struct perf_pmu *pmu, int events_dir_fd)
> > }
> >
> > if (perf_pmu__new_alias(pmu, name, /*desc=*/ NULL,
> > - /*val=*/ NULL, file, /*pe=*/ NULL) < 0)
> > + /*val=*/ NULL, file, /*pe=*/ NULL,
> > + EVENT_SRC_SYSFS) < 0)
> > pr_debug("Cannot set up %s\n", name);
> > fclose(file);
> > }
> > @@ -946,7 +964,8 @@ static int pmu_add_cpu_aliases_map_callback(const struct pmu_event *pe,
> > {
> > struct perf_pmu *pmu = vdata;
> >
> > - perf_pmu__new_alias(pmu, pe->name, pe->desc, pe->event, /*val_fd=*/ NULL, pe);
> > + perf_pmu__new_alias(pmu, pe->name, pe->desc, pe->event, /*val_fd=*/ NULL,
> > + pe, EVENT_SRC_CPU_JSON);
> > return 0;
> > }
> >
> > @@ -981,13 +1000,14 @@ static int pmu_add_sys_aliases_iter_fn(const struct pmu_event *pe,
> > return 0;
> >
> > if (pmu_uncore_alias_match(pe->pmu, pmu->name) &&
> > - pmu_uncore_identifier_match(pe->compat, pmu->id)) {
> > + pmu_uncore_identifier_match(pe->compat, pmu->id)) {
> > perf_pmu__new_alias(pmu,
> > pe->name,
> > pe->desc,
> > pe->event,
> > /*val_fd=*/ NULL,
> > - pe);
> > + pe,
> > + EVENT_SRC_SYS_JSON);
> > }
> >
> > return 0;
> > @@ -1082,6 +1102,12 @@ struct perf_pmu *perf_pmu__lookup(struct list_head *pmus, int dirfd, const char
> > pmu->max_precise = pmu_max_precise(dirfd, pmu);
> > pmu->alias_name = pmu_find_alias_name(pmu, dirfd);
> > pmu->events_table = perf_pmu__find_events_table(pmu);
> > + /*
> > + * Load the sys json events/aliases when loading the PMU as each event
> > + * may have a different compat regular expression. We therefore can't
> > + * know the number of sys json events/aliases without computing the
> > + * regular expressions for them all.
> > + */
> > pmu_add_sys_aliases(pmu);
> > list_add_tail(&pmu->list, pmus);
> >
> > @@ -1739,12 +1765,14 @@ size_t perf_pmu__num_events(struct perf_pmu *pmu)
> > size_t nr;
> >
> > pmu_aliases_parse(pmu);
> > - nr = pmu->sysfs_aliases;
> > + nr = pmu->sysfs_aliases + pmu->sys_json_aliases;;
> >
> > if (pmu->cpu_aliases_added)
> > - nr += pmu->loaded_json_aliases;
> > + nr += pmu->cpu_json_aliases;
> > else if (pmu->events_table)
> > - nr += pmu_events_table__num_events(pmu->events_table, pmu) - pmu->loaded_json_aliases;
> > + nr += pmu_events_table__num_events(pmu->events_table, pmu) - pmu->cpu_json_aliases;
> > + else
> > + assert(pmu->cpu_json_aliases == 0);
> >
> > return pmu->selectable ? nr + 1 : nr;
> > }
> > diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> > index 561716aa2b25..b2d3fd291f02 100644
> > --- a/tools/perf/util/pmu.h
> > +++ b/tools/perf/util/pmu.h
> > @@ -123,8 +123,10 @@ struct perf_pmu {
> > const struct pmu_events_table *events_table;
> > /** @sysfs_aliases: Number of sysfs aliases loaded. */
> > uint32_t sysfs_aliases;
> > - /** @sysfs_aliases: Number of json event aliases loaded. */
> > - uint32_t loaded_json_aliases;
> > + /** @cpu_json_aliases: Number of json event aliases loaded specific to the CPUID. */
> > + uint32_t cpu_json_aliases;
>
> it seems strange that we allow a pmu to have both cpu_json_aliases and
> sys_json_aliases count. A union could be used, but that would complicate
> things.
Yeah. There are a bunch of things the currently sys json allows that
we may want to change. Lazy loading has made "perf bench internals
pmu-scan" about 10 times faster, this matters on large servers that
may be getting on for 100s of particularly uncore PMUs. Sys json
events don't allow lazy loading as each json event needs its Compat
field checking against the PMU's identifier file. We're paying a cost
for sys json events in cases where the events are never used. If we
keep this behavior we could at least ahead of time compile the regular
expressions in the sys event json using lex, we should do similar for
CPUIDs. Anyway, I think there's lots of room for improvements. The
focus in this change was on correctness and I'm not worried about the
possible 4 byte per PMU saving here.
Thanks,
Ian
>
> > + /** @sys_json_aliases: Number of json event aliases loaded matching the PMU's identifier. */
> > + uint32_t sys_json_aliases;
> > /** @sysfs_aliases_loaded: Are sysfs aliases loaded from disk? */
> > bool sysfs_aliases_loaded;
> > /**
>
Powered by blists - more mailing lists