Open Source and information security mailing list archives
 
Message-ID: <CAP-5=fWx6H9g1c63wmXHRsBfEYYQNQ3p1uBviAzMtchuGB7oog@mail.gmail.com>
Date: Thu, 6 Nov 2025 10:05:15 -0800
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Alexander Shishkin <alexander.shishkin@...ux.intel.com>, 
	Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>, 
	James Clark <james.clark@...aro.org>, Xu Yang <xu.yang_2@....com>, 
	Chun-Tse Shao <ctshao@...gle.com>, Thomas Richter <tmricht@...ux.ibm.com>, 
	Sumanth Korikkar <sumanthk@...ux.ibm.com>, Collin Funk <collin.funk1@...il.com>, 
	Thomas Falcon <thomas.falcon@...el.com>, Howard Chu <howardchu95@...il.com>, 
	Dapeng Mi <dapeng1.mi@...ux.intel.com>, Levi Yun <yeoreum.yun@....com>, 
	Yang Li <yang.lee@...ux.alibaba.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v1 08/22] perf jevents: Add set of common metrics based on
 default ones

On Wed, Nov 5, 2025 at 10:22 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> On Fri, Oct 24, 2025 at 10:58:43AM -0700, Ian Rogers wrote:
> > Add support for getting a common set of metrics from a default
> > table. This simplifies the generation by adding the json metrics at
> > the same time. The metrics added are CPUs_utilized, cs_per_second,
> > migrations_per_second, page_faults_per_second, insn_per_cycle,
> > stalled_cycles_per_instruction, frontend_cycles_idle,
> > backend_cycles_idle, cycles_frequency, branch_frequency and
> > branch_miss_rate based on the shadow metric definitions.
> >
> > Following this change, the default perf stat output on an alderlake looks like:
> > ```
> > $ perf stat -a -- sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >     28,165,735,434      cpu-clock                        #   27.973 CPUs utilized
> >             23,220      context-switches                 #  824.406 /sec
> >                833      cpu-migrations                   #   29.575 /sec
> >             35,293      page-faults                      #    1.253 K/sec
> >        997,341,554      cpu_atom/instructions/           #    0.84  insn per cycle              (35.63%)
> >     11,197,053,736      cpu_core/instructions/           #    1.97  insn per cycle              (58.21%)
> >      1,184,871,493      cpu_atom/cycles/                 #    0.042 GHz                         (35.64%)
> >      5,676,692,769      cpu_core/cycles/                 #    0.202 GHz                         (58.22%)
> >        150,525,309      cpu_atom/branches/               #    5.344 M/sec                       (42.80%)
> >      2,277,232,030      cpu_core/branches/               #   80.851 M/sec                       (58.21%)
> >          5,248,575      cpu_atom/branch-misses/          #    3.49% of all branches             (42.82%)
> >         28,829,930      cpu_core/branch-misses/          #    1.27% of all branches             (58.22%)
> >                        (software)                 #    824.4 cs/sec  cs_per_second
> >              TopdownL1 (cpu_core)                 #     12.6 %  tma_bad_speculation
> >                                                   #     28.8 %  tma_frontend_bound       (66.57%)
> >              TopdownL1 (cpu_core)                 #     25.8 %  tma_backend_bound
> >                                                   #     32.8 %  tma_retiring             (66.57%)
> >                        (software)                 #   1253.1 faults/sec  page_faults_per_second
> >                                                   #      0.0 GHz  cycles_frequency       (42.80%)
> >                                                   #      0.2 GHz  cycles_frequency       (74.92%)
> >              TopdownL1 (cpu_atom)                 #     22.3 %  tma_bad_speculation
> >                                                   #     17.2 %  tma_retiring             (49.95%)
> >              TopdownL1 (cpu_atom)                 #     30.6 %  tma_backend_bound
> >                                                   #     29.8 %  tma_frontend_bound       (49.94%)
> >                        (cpu_atom)                 #      6.9 K/sec  branch_frequency     (42.89%)
> >                                                   #     80.5 K/sec  branch_frequency     (74.93%)
> >                                                   #     29.6 migrations/sec  migrations_per_second
> >                                                   #     28.0 CPUs  CPUs_utilized
> >                        (cpu_atom)                 #      0.8 instructions  insn_per_cycle  (42.91%)
> >                                                   #      2.0 instructions  insn_per_cycle  (75.14%)
> >                        (cpu_atom)                 #      3.8 %  branch_miss_rate         (35.75%)
> >                                                   #      1.2 %  branch_miss_rate         (66.86%)
> >
> >        1.007063529 seconds time elapsed
> > ```
> >
> > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > ---
> >  .../arch/common/common/metrics.json           |  86 +++++++++++++
> >  tools/perf/pmu-events/empty-pmu-events.c      | 115 +++++++++++++-----
> >  tools/perf/pmu-events/jevents.py              |  21 +++-
> >  tools/perf/pmu-events/pmu-events.h            |   1 +
> >  tools/perf/util/metricgroup.c                 |  31 +++--
> >  5 files changed, 212 insertions(+), 42 deletions(-)
> >  create mode 100644 tools/perf/pmu-events/arch/common/common/metrics.json
> >
> > diff --git a/tools/perf/pmu-events/arch/common/common/metrics.json b/tools/perf/pmu-events/arch/common/common/metrics.json
> > new file mode 100644
> > index 000000000000..d1e37db18dc6
> > --- /dev/null
> > +++ b/tools/perf/pmu-events/arch/common/common/metrics.json
> > @@ -0,0 +1,86 @@
> > +[
> > +    {
> > +        "BriefDescription": "Average CPU utilization",
> > +        "MetricExpr": "(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "CPUs_utilized",
> > +        "ScaleUnit": "1CPUs",
> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Context switches per CPU second",
> > +        "MetricExpr": "(software@context\\-switches\\,name\\=context\\-switches@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "cs_per_second",
> > +        "ScaleUnit": "1cs/sec",
> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Process migrations to a new CPU per CPU second",
> > +        "MetricExpr": "(software@cpu\\-migrations\\,name\\=cpu\\-migrations@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "migrations_per_second",
> > +        "ScaleUnit": "1migrations/sec",
> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Page faults per CPU second",
> > +        "MetricExpr": "(software@page\\-faults\\,name\\=page\\-faults@ * 1e9) / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "page_faults_per_second",
> > +        "ScaleUnit": "1faults/sec",
> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Instructions Per Cycle",
> > +        "MetricExpr": "instructions / cpu\\-cycles",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "insn_per_cycle",
> > +        "MetricThreshold": "insn_per_cycle < 1",
> > +        "ScaleUnit": "1instructions"
> > +    },
> > +    {
> > +        "BriefDescription": "Max front or backend stalls per instruction",
> > +        "MetricExpr": "max(stalled\\-cycles\\-frontend, stalled\\-cycles\\-backend) / instructions",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "stalled_cycles_per_instruction"
> > +    },
> > +    {
> > +        "BriefDescription": "Frontend stalls per cycle",
> > +        "MetricExpr": "stalled\\-cycles\\-frontend / cpu\\-cycles",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "frontend_cycles_idle",
> > +        "MetricThreshold": "frontend_cycles_idle > 0.1"
> > +    },
> > +    {
> > +        "BriefDescription": "Backend stalls per cycle",
> > +        "MetricExpr": "stalled\\-cycles\\-backend / cpu\\-cycles",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "backend_cycles_idle",
> > +        "MetricThreshold": "backend_cycles_idle > 0.2"
> > +    },
> > +    {
> > +        "BriefDescription": "Cycles per CPU second",
> > +        "MetricExpr": "cpu\\-cycles / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "cycles_frequency",
> > +        "ScaleUnit": "1GHz",
> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Branches per CPU second",
> > +        "MetricExpr": "branches / (software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@task\\-clock\\,name\\=task\\-clock@)",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "branch_frequency",
> > +        "ScaleUnit": "1000K/sec",
>
> Wouldn't it be "1000M/sec" ?

Agreed. Will fix in v2. The existing logic multiplies by 1e9 in one
place and then divides by 1e3 in another. It would be good if we could
pick better units based on the metric value, but I'll leave that for
another day.
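To make the units concrete, here is a toy Python sketch (not perf's
actual code): cpu-clock counts nanoseconds, so branches / cpu-clock is
branches per nanosecond, and a ScaleUnit starting with "1000" scales
that by 1e3, which lands in millions per second. The counter values
are taken from the cpu_core lines in the example output above.

```python
# Toy arithmetic, not perf's implementation: why the scaled
# branch_frequency value is in M/sec rather than K/sec.
branches = 2_277_232_030       # cpu_core/branches/ from the example output
cpu_clock_ns = 28_165_735_434  # cpu-clock from the example output

value = branches / cpu_clock_ns   # MetricExpr result: branches per nanosecond
scaled = value * 1000             # ScaleUnit "1000..." scales by 1e3

# branches/sec is value * 1e9, so the scaled number is in units of
# 1e9 / 1e3 = 1e6 per second, i.e. M/sec.
assert abs(scaled * 1e6 - value * 1e9) < 1e-3
print(f"{scaled:.3f} M/sec")      # ~80.851 M/sec, matching the shadow output
```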

> > +        "MetricConstraint": "NO_GROUP_EVENTS"
> > +    },
> > +    {
> > +        "BriefDescription": "Branch miss rate",
> > +        "MetricExpr": "branch\\-misses / branches",
> > +        "MetricGroup": "Default",
> > +        "MetricName": "branch_miss_rate",
> > +        "MetricThreshold": "branch_miss_rate > 0.05",
>
> Is MetricThreshold evaluated before scaling?

Yep, primarily so the threshold can share most of the
events/calculation with the metric being created. Fwiw, the 5% here
comes from the existing stat-shadow metric threshold.
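A toy sketch (again, not perf's metric code) of that ordering, using
the cpu_core counts from the example output: the threshold compares
the raw MetricExpr value, and ScaleUnit "100%" only rescales it for
printing.

```python
# Toy sketch, not perf's implementation: MetricThreshold is evaluated on
# the raw metric value; ScaleUnit "100%" only affects how it is printed.
branch_misses = 28_829_930   # cpu_core/branch-misses/ from the example output
branches = 2_277_232_030     # cpu_core/branches/ from the example output

raw = branch_misses / branches   # MetricExpr value, roughly 0.0127
crossed = raw > 0.05             # MetricThreshold "branch_miss_rate > 0.05"
displayed = raw * 100            # ScaleUnit "100%" applied for display

print(f"{displayed:.1f}%  threshold crossed: {crossed}")
```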

Thanks,
Ian

> Thanks,
> Namhyung
>
>
> > +        "ScaleUnit": "100%"
> > +    }
> > +]
