[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fUrB3gJYaCUFpizC5dtOCUdbCqiTFvWvf6be-2YGFM7QQ@mail.gmail.com>
Date: Sat, 20 Sep 2025 07:04:56 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Kan Liang <kan.liang@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, linux-perf-users@...r.kernel.org
Subject: Re: [RFC/PATCH 2/2] perf check: Add 'pmu' subcommand
On Fri, Sep 19, 2025 at 1:27 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Thu, Sep 18, 2025 at 08:33:30AM -0700, Ian Rogers wrote:
> > On Wed, Sep 17, 2025 at 11:39 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > The 'perf check pmu' command will show information about PMUs in the
> > > system like below:
> > >
> > > $ perf check pmu | head
> > > cpu_atom: type = 10
> > > core PMU: cpus = 16-27
> > > caps: branches = 32
> > > caps: max_precise = 3
> > > caps: pmu_name = alderlake_hybrid
> > > cpu_core: type = 4
> > > core PMU: cpus = 0-15
> > > caps: branches = 32
> > > caps: max_precise = 3
> > > caps: pmu_name = alderlake_hybrid
> > >
> > > The -q option will make it print the name and type of PMUs only. It
> > > also takes arguments to show matching PMUs only.
> > >
> > > $ perf check pmu -q uncore
> > > uncore_arb_0: type = 27
> > > uncore_arb_1: type = 28
> > > uncore_cbox_0: type = 15
> > > uncore_cbox_1: type = 16
> > > uncore_cbox_2: type = 17
> > > uncore_cbox_3: type = 18
> > > uncore_cbox_4: type = 19
> > > uncore_cbox_5: type = 20
> > > uncore_cbox_6: type = 21
> > > uncore_cbox_7: type = 22
> > > uncore_cbox_8: type = 23
> > > uncore_cbox_9: type = 24
> > > uncore_cbox_10: type = 25
> > > uncore_cbox_11: type = 26
> > > uncore_clock: type = 29
> > > uncore_imc_free_running_0: type = 32
> > > uncore_imc_free_running_1: type = 33
> > > uncore_imc_0: type = 30
> > > uncore_imc_1: type = 31
> > >
> > > Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> > > ---
> > > tools/perf/Documentation/perf-check.txt | 12 +++++
> > > tools/perf/builtin-check.c | 60 ++++++++++++++++++++++++-
> > > 2 files changed, 71 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/Documentation/perf-check.txt b/tools/perf/Documentation/perf-check.txt
> > > index 34dccc29d90d2fdf..2fa007698c0f0814 100644
> > > --- a/tools/perf/Documentation/perf-check.txt
> > > +++ b/tools/perf/Documentation/perf-check.txt
> > > @@ -11,6 +11,7 @@ SYNOPSIS
> > > 'perf check' [<options>]
> > > 'perf check' {feature <feature_list>} [<options>]
> > > 'perf check' {system <setting_list>} [<options>]
> > > +'perf check' {pmu <name_list>} [<options>]
> > >
> > > DESCRIPTION
> > > -----------
> > > @@ -26,6 +27,9 @@ is built-in, otherwise returns with exit status 1.
> > > If the subcommand 'system' is used, then system settins are printed on
> > > the standard output with explanation.
> > >
> > > +If the subcommand 'pmu' is used, then available PMU information is printed
> > > +on the standard output with explanation.
> > > +
> > > SUBCOMMANDS
> > > -----------
> > >
> > > @@ -88,6 +92,14 @@ SUBCOMMANDS
> > > nmi_watchdog
> > > kptr_restrict
> > >
> > > +pmu::
> > > +
> > > + Print PMU information available in the system.
> > > +
> > > + Example Usage:
> > > + perf check pmu
> > > + perf check pmu <name>
> > > +
> > >
> > > OPTIONS
> > > -------
> > > diff --git a/tools/perf/builtin-check.c b/tools/perf/builtin-check.c
> > > index 1c7c6bb5da5ccbf2..f3d034a94b1227c7 100644
> > > --- a/tools/perf/builtin-check.c
> > > +++ b/tools/perf/builtin-check.c
> > > @@ -3,6 +3,8 @@
> > > #include "color.h"
> > > #include "util/debug.h"
> > > #include "util/header.h"
> > > +#include "util/pmu.h"
> > > +#include "util/pmus.h"
> > > #include <api/fs/fs.h>
> > > #include <tools/config.h>
> > > #include <stdbool.h>
> > > @@ -10,13 +12,14 @@
> > > #include <string.h>
> > > #include <subcmd/parse-options.h>
> > >
> > > -static const char * const check_subcommands[] = { "feature", "system", NULL };
> > > +static const char * const check_subcommands[] = { "feature", "system", "pmu", NULL };
> > > static struct option check_options[] = {
> > > OPT_BOOLEAN('q', "quiet", &quiet, "do not show any warnings or messages"),
> > > OPT_END()
> > > };
> > > static struct option check_feature_options[] = { OPT_PARENT(check_options) };
> > > static struct option check_system_options[] = { OPT_PARENT(check_options) };
> > > +static struct option check_pmu_options[] = { OPT_PARENT(check_options) };
> > >
> > > static const char *check_usage[] = { NULL, NULL };
> > > static const char *check_feature_usage[] = {
> > > @@ -27,6 +30,10 @@ static const char *check_system_usage[] = {
> > > "perf check system",
> > > NULL
> > > };
> > > +static const char *check_pmu_usage[] = {
> > > + "perf check pmu",
> > > + NULL
> > > +};
> > >
> > > #define FEATURE_STATUS(name_, macro_) { \
> > > .name = name_, \
> > > @@ -278,6 +285,55 @@ static int subcommand_system(int argc, const char **argv)
> > > return 0;
> > > }
> > >
> > > +/**
> > > + * Usage: 'perf check pmu <names>'
> > > + *
> > > + * Show PMU information.
> > > + */
> > > +static int subcommand_pmu(int argc, const char **argv)
> > > +{
> > > + struct perf_pmu *pmu = NULL;
> > > + struct perf_pmu_caps *caps;
> > > +
> > > + argc = parse_options(argc, argv, check_pmu_options, check_pmu_usage, 0);
> > > +
> > > + while ((pmu = perf_pmus__scan(pmu)) != NULL) {
> > > + if (argc) {
> > > + bool found = false;
> > > +
> > > + /* only show entries match to command line arguments */
> > > + for (int k = 0; k < argc; k++) {
> > > + if (strstr(pmu->name, argv[k])) {
> >
> > To avoid divergence from parse-events it may be better to use
> > perf_pmu__wildcard_match? There is also
> > perf_pmus__scan_matching_wildcard.
>
> Sounds good, will change.
>
> >
> > > + found = true;
> > > + break;
> > > + }
> > > + }
> > > + if (!found)
> > > + continue;
> > > + }
> > > +
> > > + printf("%s: type = %u\n", pmu->name, pmu->type);
> > > + if (quiet)
> > > + continue;
> > > +
> > > + if (pmu->is_core || pmu->is_uncore) {
> > > + printf(" %score PMU", pmu->is_uncore ? "un" : "");
> > > + if (!perf_cpu_map__is_empty(pmu->cpus)) {
> > > + printf(": %s = ", pmu->is_core ? "cpus" : "cpumask");
> >
> > The cpus file is only generally present on hybrid systems. I don't
> > think this needs to be core/uncore dependent - the only thing I've
> > ever seen fail that test are some ARM uncore PMUs. I'd always dump
> > this as cpumask, when !core and !uncore then the default all online
> > CPUs cpumask is important information as that probably isn't the
> > cpumask you want with an uncore PMU.
>
> Ok, I can remove the core/uncore check. But I think it's better to
> print them only if they actually have maps to inform users that those
> PMUs need some more care. :)
So we have to deal with a bit of history in the code:
1. if there is a cpumask file then this is an uncore PMU and the
cpumask is the one given.
2. if there is a cpus file then this is a core PMU typically a hybrid one.
3. if there is no cpus file then all online CPUs are used in the
cpumask as this is probably a core PMU. Being a core PMU is only set
in the case of the PMU name being "cpu" or "cpum_cf" or for case 2.
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n1966
So there is the broken PMU case where no cpumask is provided for an
uncore PMU. Historically we treat that as having a cpumask of all
online CPUs.
There are other workaround case the perf code deals with:
- the cpumasks are rewritten for the benefit of NUMA support when
there is sub-NUMA clustering:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/x86/util/pmu.c?h=perf-tools-next#n309
- on ARM there has been a history of broken cpumasks (note on ARM
both arm and arm64 arch directories get into the perf build):
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/arch/arm/util/pmu.c?h=perf-tools-next#n43
There was an unresolved bug on ARM PMUs where CPU hot plugging
wouldn't update the cpumask and so perf would request typically uncore
events be opened on a CPU that was offline. A driver fix for that was
sent. IIRC Will Deacon had concerns that the kernel's perf API was
racy, a CPU hot plug could happen immediately after the read, and so
that never landed. I believe this means that if you offline a CPU that
is in an ARM uncore cpumask the intersection computation above will
make the cpumask empty and so we therefore open the event on all the
online CPUs. Doing that will likely cause multiplexing, aggregation of
the same count many times and a massive miscalculation of the uncore
event's count.
I therefore believe, with the caveats above, that the cpumask on
perf's internal pmu can never be empty. As it is a complicated thing
that has real implications for how events get opened, I think it is
best to just always print it.
Thanks,
Ian
> >
> > > + cpu_map__fprintf(pmu->cpus, stdout);
> > > + } else {
> > > + printf("\n");
> > > + }
> > > + }
> > > +
> > > + perf_pmu__caps_parse(pmu);
> > > + list_for_each_entry(caps, &pmu->caps, list) {
> > > + printf(" caps: %s = %s\n", caps->name, caps->value);
> > > + }
> > > + }
> > > + return 0;
> >
> > If not found return an error?
>
> Yep, good idea.
>
> Thanks for the review!
> Namhyung
>
> >
> > > +}
> > > +
> > > int cmd_check(int argc, const char **argv)
> > > {
> > > argc = parse_options_subcommand(argc, argv, check_options,
> > > @@ -290,6 +346,8 @@ int cmd_check(int argc, const char **argv)
> > > return subcommand_feature(argc, argv);
> > > if (strcmp(argv[0], "system") == 0)
> > > return subcommand_system(argc, argv);
> > > + if (strcmp(argv[0], "pmu") == 0)
> > > + return subcommand_pmu(argc, argv);
> > >
> > > /* If no subcommand matched above, print usage help */
> > > pr_err("Unknown subcommand: %s\n", argv[0]);
> > > --
> > > 2.51.0
> > >
Powered by blists - more mailing lists