Message-Id: <20220604042820.2270916-1-leo.yan@linaro.org>
Date: Sat, 4 Jun 2022 12:28:03 +0800
From: Leo Yan <leo.yan@...aro.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
John Garry <john.garry@...wei.com>,
Will Deacon <will@...nel.org>,
James Clark <james.clark@....com>,
German Gomez <german.gomez@....com>,
Ali Saidi <alisaidi@...zon.com>, Joe Mario <jmario@...hat.com>,
Adam Li <adam.li@...erecomputing.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org
Cc: Leo Yan <leo.yan@...aro.org>
Subject: [PATCH v5 00/17] perf c2c: Support data source and display for Arm64
Arm64 Neoverse CPUs support data source in Arm SPE trace, which allows
us to detect cache line contention and transfers.
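
As a rough illustration of the idea (not code from this series), the
sketch below counts, for one cache line, how many sampled loads were
served by a peer cache. All names here (sketch_sample, SKETCH_SNOOP_PEER,
count_peer_hits, the 64-byte line size) are hypothetical stand-ins and
are not the kernel's or perf's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snoop flags for this sketch only; NOT the perf UAPI values. */
enum sketch_snoop {
	SKETCH_SNOOP_NONE = 0,
	SKETCH_SNOOP_HITM = 1u << 0,	/* line was modified in another cache */
	SKETCH_SNOOP_PEER = 1u << 1,	/* line was transferred from a peer cache */
};

/* Hypothetical memory sample: a data address plus its snoop flags. */
struct sketch_sample {
	uint64_t addr;
	unsigned int snoop;
};

/* Assumed cache line size for the sketch. */
#define SKETCH_CACHE_LINE 64u

/* Round an address down to the start of its cache line. */
static uint64_t sketch_line_of(uint64_t addr)
{
	return addr & ~(uint64_t)(SKETCH_CACHE_LINE - 1);
}

/*
 * Count the samples that fall in the same cache line as 'line_addr'
 * and carry the peer snoop flag.
 */
static size_t count_peer_hits(const struct sketch_sample *samples, size_t n,
			      uint64_t line_addr)
{
	uint64_t line = sketch_line_of(line_addr);
	size_t hits = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_line_of(samples[i].addr) != line)
			continue;
		if (samples[i].snoop & SKETCH_SNOOP_PEER)
			hits++;
	}
	return hits;
}
```

A cache line with a high peer count relative to its total loads is a
candidate for contention; the actual accounting in perf c2c is more
detailed than this sketch.
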
This patch set includes Ali's patch set v9 "perf: arm-spe: Decode SPE
source and use for perf c2c" [1] and is rebased on the latest perf core
branch with latest commit 1bcca2b1bd67 ("perf vendor events intel:
Update metrics for Alderlake").
Patches 01-05 come from Ali's patch set to support data source for Arm
SPE for Neoverse cores.
Patches 06-17 are from patch set v4, which supports the perf c2c peer
display for Arm64 [2].
This patch set has been verified for both x86 perf memory events and Arm
SPE events.
[1] https://lore.kernel.org/lkml/20220517020326.18580-1-alisaidi@amazon.com/
[2] https://lore.kernel.org/lkml/20220530114036.3225544-1-leo.yan@linaro.org/
Changes from v4:
* Included Ali's patch set for adding data source in Arm SPE samples;
* Added Ian's ACK and Ali's review and test tags;
* Updated the documentation for the default peer display for Arm64 (Ali).
Changes from v3:
* Changed to display remote and local peer accesses (Joe);
* Fixed the usage info for display types (Joe);
* Do not display HITM dimensions when using the 'peer' display, and the
HITM display doesn't show any 'peer' dimensions (James);
* Split to smaller patches for adding dimensions of peer operations;
* Updated documentation to reflect the latest GUI and stdio.
Changes from v2:
* Updated patch 04 to account metrics for both cache level and ld_peer
for PEER flag;
* Updated the documentation for metric 'rmt_hit', which accounts for all
remote accesses (including remote DRAM and any upward caches).
Changes from v1:
* Updated patches 01, 02 and 03 to support 'N/A' metrics for store
operations, so that they align with the patch set [1] for store samples.
Ali Saidi (3):
perf: Add SNOOP_PEER flag to perf mem data struct
perf tools: sync addition of PERF_MEM_SNOOPX_PEER
perf arm-spe: Use SPE data source for neoverse cores
Leo Yan (14):
perf mem: Print snoop peer flag
perf arm-spe: Don't set data source if it's not a memory operation
perf mem: Add statistics for peer snooping
perf c2c: Output statistics for peer snooping
perf c2c: Add dimensions for peer load operations
perf c2c: Add dimensions of peer metrics for cache line view
perf c2c: Add mean dimensions for peer operations
perf c2c: Use explicit names for display macros
perf c2c: Rename dimension from 'percent_hitm' to
'percent_costly_snoop'
perf c2c: Refactor node header
perf c2c: Refactor display string
perf c2c: Sort on peer snooping for load operations
perf c2c: Use 'peer' as default display for Arm64
perf c2c: Update documentation for new display option 'peer'
include/uapi/linux/perf_event.h | 2 +-
tools/include/uapi/linux/perf_event.h | 2 +-
tools/perf/Documentation/perf-c2c.txt | 31 +-
tools/perf/builtin-c2c.c | 454 ++++++++++++++----
.../util/arm-spe-decoder/arm-spe-decoder.c | 1 +
.../util/arm-spe-decoder/arm-spe-decoder.h | 12 +
tools/perf/util/arm-spe.c | 140 +++++-
tools/perf/util/mem-events.c | 46 +-
tools/perf/util/mem-events.h | 3 +
9 files changed, 550 insertions(+), 141 deletions(-)
--
2.25.1