lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aMHbIGRFeQlq9ABx@google.com>
Date: Wed, 10 Sep 2025 13:10:08 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Kan Liang <kan.liang@...ux.intel.com>,
	James Clark <james.clark@...aro.org>, Xu Yang <xu.yang_2@....com>,
	Thomas Falcon <thomas.falcon@...el.com>,
	Andi Kleen <ak@...ux.intel.com>, linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org, bpf@...r.kernel.org,
	Atish Patra <atishp@...osinc.com>,
	Beeman Strong <beeman@...osinc.com>, Leo Yan <leo.yan@....com>,
	Vince Weaver <vincent.weaver@...ne.edu>
Subject: Re: [PATCH v3 00/15] Legacy hardware/cache events as json

Hi Ian,

On Thu, Aug 28, 2025 at 01:59:15PM -0700, Ian Rogers wrote:
> Mirroring similar work for software events in commit 6e9fa4131abb
> ("perf parse-events: Remove non-json software events"). These changes
> migrate the legacy hardware and cache events to json.  With no hard
> coded legacy hardware or cache events the wild card, case
> insensitivity, etc. is consistent for events. This does, however, mean
> events like cycles will wild card against all PMUs. A change doing the
> same was originally posted and merged from:
> https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com
> and reverted by Linus in commit 4f1b067359ac ("Revert "perf
> parse-events: Prefer sysfs/JSON hardware events over legacy"") due to
> his dislike for the cycles behavior on ARM with perf record. Earlier
> patches in this series make perf record event opening failures
> non-fatal and hide the cycles event's failure to open on ARM in perf
> record, so it is expected the behavior will now be transparent in perf
> record on ARM. perf stat with a cycles event will wildcard open the
> event on all PMUs.
> 
> The change to support legacy events with PMUs was done to clean up
> Intel's hybrid PMU implementation. Having sysfs/json events with
> increased priority to legacy was requested by Mark Rutland
>  <mark.rutland@....com> to fix Apple-M PMU issues wrt broken legacy
> events on that PMU. It is believed the PMU driver is now fixed, but
> this has only been confirmed on ARM Juno boards. It was requested that
> RISC-V be able to add events to the perf tool json so the PMU driver
> didn't need to map legacy events to config encodings:
> https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/
> This patch series achieves this.
> 
> A previous series of patches decreasing legacy hardware event
> priorities was posted in:
> https://lore.kernel.org/lkml/20250416045117.876775-1-irogers@google.com/
> Namhyung Kim <namhyung@...nel.org> mentioned that hardware and
> software events can be implemented similarly:
> https://lore.kernel.org/lkml/aIJmJns2lopxf3EK@google.com/
> and this patch series achieves this.

Thanks for working on this.  Yeah, I think it's be easier to handle all
events consistently with JSON.  I expect the sysfs encoding will be used
in a higher priority if it comes with <PMU>/<EVENT>/ format.

> 
> Note, patch 1 (perf parse-events: Fix legacy cache events if event is
> duplicated in a PMU) fixes a function deleted by patch 15 (perf
> parse-events: Remove hard coded legacy hardware and cache
> parsing). Adding the json exposed an issue when legacy cache (not
> legacy hardware) and sysfs/json events exist. The fix is necessary to
> keep tests passing through the series. It is also posted for backports
> to stable trees.

Sounds ok.

> 
> The perf list behavior includes a lot more information and events. The
> before behavior on a hybrid alderlake is:
> ```
> $ perf list hw
> 
> List of pre-defined events (to be used in -e or -M):
> 
>   branch-instructions OR branches                    [Hardware event]
>   branch-misses                                      [Hardware event]
>   bus-cycles                                         [Hardware event]
>   cache-misses                                       [Hardware event]
>   cache-references                                   [Hardware event]
>   cpu-cycles OR cycles                               [Hardware event]
>   instructions                                       [Hardware event]
>   ref-cycles                                         [Hardware event]
> $ perf list hwcache
> 
> List of pre-defined events (to be used in -e or -M):
> 
> 
> cache:
>   L1-dcache-loads OR cpu_atom/L1-dcache-loads/
>   L1-dcache-stores OR cpu_atom/L1-dcache-stores/
>   L1-icache-loads OR cpu_atom/L1-icache-loads/
>   L1-icache-load-misses OR cpu_atom/L1-icache-load-misses/
>   LLC-loads OR cpu_atom/LLC-loads/
>   LLC-load-misses OR cpu_atom/LLC-load-misses/
>   LLC-stores OR cpu_atom/LLC-stores/
>   LLC-store-misses OR cpu_atom/LLC-store-misses/
>   dTLB-loads OR cpu_atom/dTLB-loads/
>   dTLB-load-misses OR cpu_atom/dTLB-load-misses/
>   dTLB-stores OR cpu_atom/dTLB-stores/
>   dTLB-store-misses OR cpu_atom/dTLB-store-misses/
>   iTLB-load-misses OR cpu_atom/iTLB-load-misses/
>   branch-loads OR cpu_atom/branch-loads/
>   branch-load-misses OR cpu_atom/branch-load-misses/
>   L1-dcache-loads OR cpu_core/L1-dcache-loads/
>   L1-dcache-load-misses OR cpu_core/L1-dcache-load-misses/
>   L1-dcache-stores OR cpu_core/L1-dcache-stores/
>   L1-icache-load-misses OR cpu_core/L1-icache-load-misses/
>   LLC-loads OR cpu_core/LLC-loads/
>   LLC-load-misses OR cpu_core/LLC-load-misses/
>   LLC-stores OR cpu_core/LLC-stores/
>   LLC-store-misses OR cpu_core/LLC-store-misses/
>   dTLB-loads OR cpu_core/dTLB-loads/
>   dTLB-load-misses OR cpu_core/dTLB-load-misses/
>   dTLB-stores OR cpu_core/dTLB-stores/
>   dTLB-store-misses OR cpu_core/dTLB-store-misses/
>   iTLB-load-misses OR cpu_core/iTLB-load-misses/
>   branch-loads OR cpu_core/branch-loads/
>   branch-load-misses OR cpu_core/branch-load-misses/
>   node-loads OR cpu_core/node-loads/
>   node-load-misses OR cpu_core/node-load-misses/
> ```
> and after it is:
> ```
> $ perf list hw
> 
> legacy hardware:
>   branch-instructions
>        [Retired branch instructions [This event is an alias of branches].
>         Unit: cpu_atom]
>   branch-misses
>        [Mispredicted branch instructions. Unit: cpu_atom]
>   branches
>        [Retired branch instructions [This event is an alias of
>         branch-instructions]. Unit: cpu_atom]

A nit.  Can we have one actual event and an alias of it?

I think 'branch-instructions' will be the actual event and 'branches'
will be the alias.  Then the description will be like

  branch-instructions
      [Retired branch instructions.  Unit: cpu_atom]
  ...

  branches
      [This event is an alias of branch-instructions.]

The same goes to 'cycles' and 'cpu-cycles'.

Thanks,
Namhyung


>   bus-cycles
>        [Bus cycles,which can be different from total cycles. Unit: cpu_atom]
>   cache-misses
>        [Cache misses. Usually this indicates Last Level Cache misses; this is
>         intended to be used in conjunction with the
>         PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.
>         Unit: cpu_atom]
>   cache-references
>        [Cache accesses. Usually this indicates Last Level Cache accesses but
>         this may vary depending on your CPU. This may include prefetches and
>         coherency messages; again this depends on the design of your CPU.
>         Unit: cpu_atom]
>   cpu-cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cycles]. Unit: cpu_atom]
>   cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cpu-cycles]. Unit: cpu_atom]
>   instructions
>        [Retired instructions. Be careful,these can be affected by various
>         issues,most notably hardware interrupt counts. Unit: cpu_atom]
>   ref-cycles
>        [Total cycles; not affected by CPU frequency scaling. Unit: cpu_atom]
>   branch-instructions
>        [Retired branch instructions [This event is an alias of branches].
>         Unit: cpu_core]
>   branch-misses
>        [Mispredicted branch instructions. Unit: cpu_core]
>   branches
>        [Retired branch instructions [This event is an alias of
>         branch-instructions]. Unit: cpu_core]
>   bus-cycles
>        [Bus cycles,which can be different from total cycles. Unit: cpu_core]
>   cache-misses
>        [Cache misses. Usually this indicates Last Level Cache misses; this is
>         intended to be used in conjunction with the
>         PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates.
>         Unit: cpu_core]
>   cache-references
>        [Cache accesses. Usually this indicates Last Level Cache accesses but
>         this may vary depending on your CPU. This may include prefetches and
>         coherency messages; again this depends on the design of your CPU.
>         Unit: cpu_core]
>   cpu-cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cycles]. Unit: cpu_core]
>   cycles
>        [Total cycles. Be wary of what happens during CPU frequency scaling
>         [This event is an alias of cpu-cycles]. Unit: cpu_core]
>   instructions
>        [Retired instructions. Be careful,these can be affected by various
>         issues,most notably hardware interrupt counts. Unit: cpu_core]
>   ref-cycles
>        [Total cycles; not affected by CPU frequency scaling. Unit: cpu_core]
> $ perf list hwcache
> 
> legacy cache:
>   branch-load-misses
>        [Branch prediction unit read misses. Unit: cpu_atom]
>   branch-loads
>        [Branch prediction unit read accesses. Unit: cpu_atom]
>   dtlb-load-misses
>        [Data TLB read misses. Unit: cpu_atom]
>   dtlb-loads
>        [Data TLB read accesses. Unit: cpu_atom]
>   dtlb-store-misses
>        [Data TLB write misses. Unit: cpu_atom]
>   dtlb-stores
>        [Data TLB write accesses. Unit: cpu_atom]
>   itlb-load-misses
>        [Instruction TLB read misses. Unit: cpu_atom]
>   l1-dcache-loads
>        [Level 1 data cache read accesses. Unit: cpu_atom]
>   l1-dcache-stores
>        [Level 1 data cache write accesses. Unit: cpu_atom]
>   l1-icache-load-misses
>        [Level 1 instruction cache read misses. Unit: cpu_atom]
>   l1-icache-loads
>        [Level 1 instruction cache read accesses. Unit: cpu_atom]
>   llc-load-misses
>        [Last level cache read misses. Unit: cpu_atom]
>   llc-loads
>        [Last level cache read accesses. Unit: cpu_atom]
>   llc-store-misses
>        [Last level cache write misses. Unit: cpu_atom]
>   llc-stores
>        [Last level cache write accesses. Unit: cpu_atom]
>   branch-load-misses
>        [Branch prediction unit read misses. Unit: cpu_core]
>   branch-loads
>        [Branch prediction unit read accesses. Unit: cpu_core]
>   dtlb-load-misses
>        [Data TLB read misses. Unit: cpu_core]
>   dtlb-loads
>        [Data TLB read accesses. Unit: cpu_core]
>   dtlb-store-misses
>        [Data TLB write misses. Unit: cpu_core]
>   dtlb-stores
>        [Data TLB write accesses. Unit: cpu_core]
>   itlb-load-misses
>        [Instruction TLB read misses. Unit: cpu_core]
>   l1-dcache-load-misses
>        [Level 1 data cache read misses. Unit: cpu_core]
>   l1-dcache-loads
>        [Level 1 data cache read accesses. Unit: cpu_core]
>   l1-dcache-stores
>        [Level 1 data cache write accesses. Unit: cpu_core]
>   l1-icache-load-misses
>        [Level 1 instruction cache read misses. Unit: cpu_core]
>   llc-load-misses
>        [Last level cache read misses. Unit: cpu_core]
>   llc-loads
>        [Last level cache read accesses. Unit: cpu_core]
>   llc-store-misses
>        [Last level cache write misses. Unit: cpu_core]
>   llc-stores
>        [Last level cache write accesses. Unit: cpu_core]
>   node-load-misses
>        [Local memory read misses. Unit: cpu_core]
>   node-loads
>        [Local memory read accesses. Unit: cpu_core]
> ```
> 
> v3: Deprecate the legacy cache events that aren't shown in the
>     previous perf list to avoid the perf list output being too verbose.
> 
> v2: Additional details to the cover letter. Credit to Vince Weaver
>     added to the commit message for the event details. Additional
>     patches to clean up perf_pmu new_alias by removing an unused term
>     scanner argument and avoid stdio usage.
>     https://lore.kernel.org/lkml/20250828163225.3839073-1-irogers@google.com/
> 
> v1: https://lore.kernel.org/lkml/20250828064231.1762997-1-irogers@google.com/
> 
> Ian Rogers (15):
>   perf parse-events: Fix legacy cache events if event is duplicated in a
>     PMU
>   perf perf_api_probe: Avoid scanning all PMUs, try software PMU first
>   perf record: Skip don't fail for events that don't open
>   perf jevents: Support copying the source json files to OUTPUT
>   perf pmu: Don't eagerly parse event terms
>   perf parse-events: Remove unused FILE input argument to scanner
>   perf pmu: Use fd rather than FILE from new_alias
>   perf pmu: Factor term parsing into a perf_event_attr into a helper
>   perf parse-events: Add terms for legacy hardware and cache config
>     values
>   perf jevents: Add legacy json terms and default_core event table
>     helper
>   perf pmu: Add and use legacy_terms in alias information
>   perf jevents: Add legacy-hardware and legacy-cache json
>   perf print-events: Remove print_hwcache_events
>   perf print-events: Remove print_symbol_events
>   perf parse-events: Remove hard coded legacy hardware and cache parsing
> 
>  tools/perf/Makefile.perf                      |   21 +-
>  tools/perf/arch/x86/util/intel-pt.c           |    2 +-
>  tools/perf/builtin-list.c                     |   34 +-
>  tools/perf/builtin-record.c                   |   89 +-
>  tools/perf/pmu-events/Build                   |   24 +-
>  .../arch/common/common/legacy-hardware.json   |   72 +
>  tools/perf/pmu-events/empty-pmu-events.c      | 2763 ++++++++++++++++-
>  tools/perf/pmu-events/jevents.py              |   24 +
>  tools/perf/pmu-events/make_legacy_cache.py    |  129 +
>  tools/perf/pmu-events/pmu-events.h            |    1 +
>  tools/perf/tests/parse-events.c               |    2 +-
>  tools/perf/tests/pmu-events.c                 |   24 +-
>  tools/perf/tests/pmu.c                        |    3 +-
>  tools/perf/util/parse-events.c                |  283 +-
>  tools/perf/util/parse-events.h                |   16 +-
>  tools/perf/util/parse-events.l                |   54 +-
>  tools/perf/util/parse-events.y                |  114 +-
>  tools/perf/util/perf_api_probe.c              |   27 +-
>  tools/perf/util/pmu.c                         |  302 +-
>  tools/perf/util/print-events.c                |  112 -
>  tools/perf/util/print-events.h                |    4 -
>  21 files changed, 3330 insertions(+), 770 deletions(-)
>  create mode 100644 tools/perf/pmu-events/arch/common/common/legacy-hardware.json
>  create mode 100755 tools/perf/pmu-events/make_legacy_cache.py
> 
> -- 
> 2.51.0.318.gd7df087d1a-goog
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ