Message-ID: <aRvNux4vlacfrgin@google.com>
Date: Mon, 17 Nov 2025 17:36:59 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: James Clark <james.clark@...aro.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Xu Yang <xu.yang_2@....com>, Chun-Tse Shao <ctshao@...gle.com>,
	Thomas Richter <tmricht@...ux.ibm.com>,
	Sumanth Korikkar <sumanthk@...ux.ibm.com>,
	Collin Funk <collin.funk1@...il.com>,
	Thomas Falcon <thomas.falcon@...el.com>,
	Howard Chu <howardchu95@...il.com>,
	Dapeng Mi <dapeng1.mi@...ux.intel.com>,
	Levi Yun <yeoreum.yun@....com>,
	Yang Li <yang.lee@...ux.alibaba.com>, linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org, Andi Kleen <ak@...ux.intel.com>,
	Weilin Wang <weilin.wang@...el.com>, Leo Yan <leo.yan@....com>
Subject: Re: [PATCH v4 03/18] perf jevents: Add set of common metrics based
 on default ones

On Sat, Nov 15, 2025 at 07:29:29PM -0800, Ian Rogers wrote:
> On Sat, Nov 15, 2025 at 9:52 AM Namhyung Kim <namhyung@...nel.org> wrote:
> >
> > On Fri, Nov 14, 2025 at 08:57:39AM -0800, Ian Rogers wrote:
> > > On Fri, Nov 14, 2025 at 8:28 AM James Clark <james.clark@...aro.org> wrote:
> > > >
> > > >
> > > >
> > > > On 11/11/2025 9:21 pm, Ian Rogers wrote:
> > > > > > Add support for getting a common set of metrics from a default
> > > > > table. It simplifies the generation to add json metrics at the same
> > > > > time. The metrics added are CPUs_utilized, cs_per_second,
> > > > > migrations_per_second, page_faults_per_second, insn_per_cycle,
> > > > > stalled_cycles_per_instruction, frontend_cycles_idle,
> > > > > backend_cycles_idle, cycles_frequency, branch_frequency and
> > > > > branch_miss_rate based on the shadow metric definitions.
> > > > >
> > > > > Following this change the default perf stat output on an alderlake
> > > > > looks like:
> > > > > ```
> > > > > $ perf stat -a -- sleep 2
> > > > >
> > > > >   Performance counter stats for 'system wide':
> > > > >
> > > > >                0.00 msec cpu-clock                        #    0.000 CPUs utilized
> > > > >              77,739      context-switches
> > > > >              15,033      cpu-migrations
> > > > >             321,313      page-faults
> > > > >      14,355,634,225      cpu_atom/instructions/           #    1.40  insn per cycle              (35.37%)
> > > > >     134,561,560,583      cpu_core/instructions/           #    3.44  insn per cycle              (57.85%)
> > > > >      10,263,836,145      cpu_atom/cycles/                                                        (35.42%)
> > > > >      39,138,632,894      cpu_core/cycles/                                                        (57.60%)
> > > > >       2,989,658,777      cpu_atom/branches/                                                      (42.60%)
> > > > >      32,170,570,388      cpu_core/branches/                                                      (57.39%)
> > > > >          29,789,870      cpu_atom/branch-misses/          #    1.00% of all branches             (42.69%)
> > > > >         165,991,152      cpu_core/branch-misses/          #    0.52% of all branches             (57.19%)
> > > > >                         (software)                 #      nan cs/sec  cs_per_second
> > > > >               TopdownL1 (cpu_core)                 #     11.9 %  tma_bad_speculation
> > > > >                                                    #     19.6 %  tma_frontend_bound       (63.97%)
> > > > >               TopdownL1 (cpu_core)                 #     18.8 %  tma_backend_bound
> > > > >                                                    #     49.7 %  tma_retiring             (63.97%)
> > > > >                         (software)                 #      nan faults/sec  page_faults_per_second
> > > > >                                                    #      nan GHz  cycles_frequency       (42.88%)
> > > > >                                                    #      nan GHz  cycles_frequency       (69.88%)
> > > > >               TopdownL1 (cpu_atom)                 #     11.7 %  tma_bad_speculation
> > > > >                                                    #     29.9 %  tma_retiring             (50.07%)
> > > > >               TopdownL1 (cpu_atom)                 #     31.3 %  tma_frontend_bound       (43.09%)
> > > > >                         (cpu_atom)                 #      nan M/sec  branch_frequency     (43.09%)
> > > > >                                                    #      nan M/sec  branch_frequency     (70.07%)
> > > > >                                                    #      nan migrations/sec  migrations_per_second
> > > > >               TopdownL1 (cpu_atom)                 #     27.1 %  tma_backend_bound        (43.08%)
> > > > >                         (software)                 #      0.0 CPUs  CPUs_utilized
> > > > >                                                    #      1.4 instructions  insn_per_cycle  (43.04%)
> > > > >                                                    #      3.5 instructions  insn_per_cycle  (69.99%)
> > > > >                                                    #      1.0 %  branch_miss_rate         (35.46%)
> > > > >                                                    #      0.5 %  branch_miss_rate         (65.02%)
> > > > >
> > > > >         2.005626564 seconds time elapsed
> > > > > ```
> > > > >
> > > > > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > > > > ---
> > > > >   .../arch/common/common/metrics.json           |  86 +++++++++++++
> > > > >   tools/perf/pmu-events/empty-pmu-events.c      | 115 +++++++++++++-----
> > > > >   tools/perf/pmu-events/jevents.py              |  21 +++-
> > > > >   tools/perf/pmu-events/pmu-events.h            |   1 +
> > > > >   tools/perf/util/metricgroup.c                 |  31 +++--
> > > > >   5 files changed, 212 insertions(+), 42 deletions(-)
> > > > >   create mode 100644 tools/perf/pmu-events/arch/common/common/metrics.json
> > > > >
> > > > > diff --git a/tools/perf/pmu-events/arch/common/common/metrics.json b/tools/perf/pmu-events/arch/common/common/metrics.json
> > > > > new file mode 100644
> > > > > index 000000000000..d915be51e300
> > > > > --- /dev/null
> > > > > +++ b/tools/perf/pmu-events/arch/common/common/metrics.json
> > > > > @@ -0,0 +1,86 @@
> > > > > +[
> > > > > +    {
> > > > > +        "BriefDescription": "Average CPU utilization",
> > > > > +        "MetricExpr": "(software@cpu\\-clock\\,name\\=cpu\\-clock@ if #target_cpu else software@...k\\-clock\\,name\\=task\\-clock@) / (duration_time * 1e9)",
> > > >
> > > > Hi Ian,
> > > >
> > > > I noticed that this metric is making "perf stat tests" fail.
> > > > "duration_time" is a tool event and they don't work with "perf stat
> > > > record" anymore. The test tests the record command with the default args
> > > > which results in this event being used and a failure.
> > > >
> > > > > I suppose there are three issues. The first two are unrelated to this change:
> > > >
> > > >   - Perf stat record continues to write out a bad perf.data file even
> > > >     though it knows that tool events won't work.
> > > >
> > > >     For example 'status' ends up being -1 in cmd_stat() but it's ignored
> > > >     for some of the writing parts. It does decide to not print any stdout
> > > >     though:
> > > >
> > > >     $ perf stat record -e "duration_time"
> > > >     <blank>
> > > >
> > > >   - The other issue is obviously that tool events don't work with perf
> > > >     stat record which seems to be a regression from 6828d6929b76 ("perf
> > > >     evsel: Refactor tool events")
> > > >
> > > >   - The third issue is that this change adds a broken tool event to the
> > > >     default output of perf stat
> > > >
> > > > I'm not actually sure what "perf stat record" is for? It's possible that
> > > > > it's not used anymore, especially if nobody noticed that tool events
> > > > haven't been working in it for a while.
> > > >
> > > > I think we're also supposed to have json output for perf stat (although
> > > > this is also broken in some obscure scenarios), so maybe perf stat
> > > > record isn't needed anymore?
> > >
> > > Hi James,
> > >
> > > > Thanks for the report. I think this overlaps with the fact that perf
> > > > stat metrics don't work with perf stat record, and these changes made
> > > > metrics the default. Let me do some follow up work, as the perf
> > > > script work shows we can do useful things with metrics while not being
> > > > on a live perf stat. There's the obstacle that the CPUID of the host
> > > > will be used :-/
> > >
> > > > Anyway, I'll take a look and we should add a test on this. There is
> > > > one that checks the perf stat json output is okay, by some definition. One
> > > problem is that the stat-display code is complete spaghetti. Now that
> > > stat-shadow only handles json metrics, and perf script isn't trying to
> > > maintain a set of shadow counters, that is a little bit improved.
> >
> > I have another test failure on this.  On my AMD machine, perf all
> > metrics test fails due to missing "LLC-loads" events.
> >
> >   $ sudo perf stat -M llc_miss_rate true
> >   Error:
> >   No supported events found.
> >   The LLC-loads event is not supported.
> >
> > Maybe we need to make some cache metrics conditional as some events are
> > missing.
> 
> > Maybe we can use `perf list Default`, etc. if this is a problem. We have
> similar unsupported events in metrics on Intel like:
> 
> ```
> $ perf stat -M itlb_miss_rate -a sleep 1
> 
>  Performance counter stats for 'system wide':
> 
>    <not supported>      iTLB-loads
>            168,926      iTLB-load-misses
> 
>        1.002287122 seconds time elapsed
> ```
> 
> but I've not seen failures:
> 
> ```
> $ perf test -v "all metrics"
> 103: perf all metrics test                                           : Skip
> ```

  $ sudo perf test -v "all metrics"
  --- start ---
  test child forked, pid 1347112
  Testing CPUs_utilized
  Testing backend_cycles_idle
  Not supported events
  Performance counter stats for 'system wide': <not counted> cpu-cycles <not supported> stalled-cycles-backend 0.013162328 seconds time elapsed
  Testing branch_frequency
  Testing branch_miss_rate
  Testing cs_per_second
  Testing cycles_frequency
  Testing frontend_cycles_idle
  Testing insn_per_cycle
  Testing migrations_per_second
  Testing page_faults_per_second
  Testing stalled_cycles_per_instruction
  Testing l1d_miss_rate
  Testing llc_miss_rate
  Metric contains missing events
  Error: No supported events found. The LLC-loads event is not supported.
  Testing dtlb_miss_rate
  Testing itlb_miss_rate
  Testing l1i_miss_rate
  Testing l1_prefetch_miss_rate
  Not supported events
  Performance counter stats for 'system wide': <not counted> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 0.012983559 seconds time elapsed
  Testing branch_misprediction_ratio
  Testing all_remote_links_outbound
  Testing nps1_die_to_dram
  Testing all_l2_cache_accesses
  Testing all_l2_cache_hits
  Testing all_l2_cache_misses
  Testing ic_fetch_miss_ratio
  Testing l2_cache_accesses_from_l2_hwpf
  Testing l2_cache_misses_from_l2_hwpf
  Testing l3_read_miss_latency
  Testing l1_itlb_misses
  ---- end(-1) ----
  103: perf all metrics test                                           : FAILED!
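
For triage, the log above can be reduced to just the metrics that printed a
diagnostic. A minimal sketch in Python, assuming the log format shown here
("Testing <metric>" lines followed by optional messages). The helper names
summarize_metric_log and flagged are hypothetical and not part of perf, and
the sample is an excerpt of the output above:

```python
import re

# Excerpt of the `perf test -v "all metrics"` output quoted above.
SAMPLE = """\
--- start ---
test child forked, pid 1347112
Testing CPUs_utilized
Testing backend_cycles_idle
Not supported events
Testing llc_miss_rate
Metric contains missing events
Error: No supported events found. The LLC-loads event is not supported.
Testing dtlb_miss_rate
---- end(-1) ----
103: perf all metrics test : FAILED!
"""


def summarize_metric_log(log):
    """Map each tested metric to the diagnostic lines printed after it."""
    results = {}
    current = None
    for raw in log.splitlines():
        line = raw.strip()
        if line.startswith("---"):
            # "--- start ---" / "---- end(-1) ----" markers close any metric.
            current = None
            continue
        match = re.match(r"Testing (\S+)$", line)
        if match:
            current = match.group(1)
            results[current] = []
        elif current is not None and line:
            results[current].append(line)
    return results


def flagged(results):
    """Keep only the metrics that printed at least one diagnostic line."""
    return {name: notes for name, notes in results.items() if notes}


if __name__ == "__main__":
    for name, notes in flagged(summarize_metric_log(SAMPLE)).items():
        print(name)
        for note in notes:
            print("   ", note)
```

This only groups messages by metric. It does not decide which message caused
the overall failure, since "Not supported events" also appears for metrics
that may be skipped rather than failed.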

Thanks,
Namhyung

