linux-kernel - Re: [PATCH v1 2/2] perf metricgroup: Don't early exit if no CPUID table exists

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAP-5=fXPsOuqr+4DZmscoWGr1VHHJci4jRGxbOtJyy_Mp9Oa9w@mail.gmail.com>
Date: Fri, 6 Feb 2026 16:14:27 -0800
From: Ian Rogers <irogers@...gle.com>
To: Leo Yan <leo.yan@....com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, James Clark <james.clark@...aro.org>, 
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 2/2] perf metricgroup: Don't early exit if no CPUID
 table exists

On Fri, Feb 6, 2026 at 2:16 PM Ian Rogers <irogers@...gle.com> wrote:
>
> On Fri, Feb 6, 2026 at 1:50 PM Leo Yan <leo.yan@....com> wrote:
> >
> > On Fri, Feb 06, 2026 at 10:57:54AM -0800, Ian Rogers wrote:
> >
> > [...]
> >
> > > > > >  bool metricgroup__has_metric_or_groups(const char *pmu, const char *metric_or_groups)
> > > > > >  {
> > > > > > -   const struct pmu_metrics_table *tables[2] = {
> > > > > > -           pmu_metrics_table__find(),
> > > > > > -           pmu_metrics_table__default(),
> > > > > > -   };
> > > > > > +   const struct pmu_metrics_table *table = pmu_metrics_table__find();
> > > > >
> > > > > Here should be:
> > > > >
> > > > >   const struct pmu_metrics_table *table = pmu_metrics_table__default();
> > > >
> > > > Or, I think you should not change metricgroup__has_metric_or_groups().
> > > > The change only in metricgroup__parse_groups() can fix my issue.
> > >
> > > This is wrong and the patch I tested I believe to be correct.
> >
> > It is very likely you are not working on the issue same as mine.
> >
> > > The table here is CPUID associated table and all the metric code will
> >
> > As you said, the key point is CPUID.  On my board, as no corresponding
> > json file to support the CPU variant, it fails to find any table based
> > on the CPUID and pmu_metrics_table__find() always returns NULL.
> >
> > The failure flow is:
> >
> >   add_default_events()
> >     if (!metricgroup__has_metric_or_groups(pmu, default_metricgroup_names[i]))
> >         continue;
> >      ^^^
> >      Always return false
> >
> > As a result, metricgroup__parse_groups() has no chance to be invoked
> > and no any events are added to the ev list.
> >
> > This is why I suggested to keep default table in
> > metricgroup__has_metric_or_groups(), it can rallback to find default
> > table when CPUID is not supported.
> >
> > > bottom out in metricgroup__for_each_metric where the CPUID table is
> > > the table parameter (its a parameter because of testing where we force
> > > the machine/model to "testarch"):
> > > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/metricgroup.c?h=perf-tools-next#n430
> > > ```
> > > int metricgroup__for_each_metric(const struct pmu_metrics_table
> > > *table, pmu_metric_iter_fn fn, void *data)
> > > {
> > > ...
> > > const struct pmu_metrics_table *tables[2] = {
> > >     table,
> > >     pmu_metrics_table__default(),
> > > };
> > >
> > > for (size_t i = 0; i < ARRAY_SIZE(tables); i++) {
> > >     int ret;
> > >
> > >     if (!tables[i])
> > >         continue;
> > > ```
> > > If you make the table here the default table then you will process the
> > > default table twice in metricgroup__for_each_metric which will mean
> > > double metric look ups, etc.
> >
> > In my case, it will not have double metric issue, as
> > pmu_metrics_table__find() cannot find any table, a NULL table pointer
> > will be passed to metricgroup__for_each_metric(), then only
> > pmu_metrics_table__default() will be used.
>
> Which is the table you want to be used to such an extent that you are
> proposing changing this patch to put it twice in the tables array in
> the snippet of code above. Also note that the code skips the NULL,
> failed to look up table, which is why this patch just needs to avoid
> the early return on a missing table to address the problem of the
> default table not being considered.
>
> > As a side question, using a single CPUID to look up metrics can be
> > error-prone on asymmetric systems (such as Arm big.LITTLE).  Since only
> > the first CPU ID is used to search the metrics table, could this cause
> > issues for other CPU variants?
>
> There's logic to use the PMU for the cpumask and then to get the CPUID
> for the ARM CPU associated with the CPUs in the cpumask. It is likely
> bitrotten, but the problem has had some work on it in the past. It is
> untested, to my knowledge.

Ah, the issue is that metricgroup__has_metric_or_groups is using
pmu_metrics_table__for_each_metric and not
metricgroup__for_each_metric:
 ```
-       return pmu_metrics_table__for_each_metric(table,
-
metricgroup__has_metric_or_groups_callback,
-                                                 &data)
+       return metricgroup__for_each_metric(table,
+
metricgroup__has_metric_or_groups_callback,
+
```
I'll send this out.

Thanks,
Ian

> Thanks,
> Ian
>
> > Thanks,
> > Leo