Message-Id: <20250218152818.158614-1-dapeng1.mi@linux.intel.com>
Date: Tue, 18 Feb 2025 15:27:54 +0000
From: Dapeng Mi <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>,
Eranian Stephane <eranian@...gle.com>
Cc: linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
Dapeng Mi <dapeng1.mi@...el.com>,
Dapeng Mi <dapeng1.mi@...ux.intel.com>
Subject: [Patch v2 00/24] Arch-PEBS and PMU supports for Clearwater Forest and Panther Lake
This v2 patch series is based on the latest perf/core tree "1623ced247f7
(x86/events/amd/iommu: Increase IOMMU_NAME_SIZE)" plus the first two
patches of the patch set "Cleanup for Intel PMU initialization"[1].
Changes:
v1 -> v2:
* Add Panther Lake PMU support (patch 02/24)
* Add PEBS static calls to avoid introducing too many
x86_pmu.arch_pebs checks (patch 07~08/24)
* Optimize PEBS constraints based on Kan's dynamic constraint patch
(patch 13/24)
* Split the perf tools patch that supports more vector registers into
several smaller patches (patch 20~22/24)
Tests:
* Ran the tests below on Clearwater Forest and found no issues. Note
that nmi_watchdog was disabled while running the tests.
a. Basic perf counting case.
perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1
b. Basic PMI based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1
c. Basic PEBS based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}:p' sleep 1
d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 sleep 1
e. PEBS sampling case with auxiliary (memory info) group
perf mem record sleep 1
f. PEBS sampling case with counter group
perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1
g. Perf stat and record test
perf test 95; perf test 119
h. perf-fuzzer test
* Ran similar tests on Panther Lake P-cores and E-cores and found no
issues. CPU 0 is a P-core and CPU 9 is an E-core. nmi_watchdog was
disabled here as well.
P-core:
a. Basic perf counting case.
perf stat -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1
b. Basic PMI based perf sampling case.
perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1
c. Basic PEBS based perf sampling case.
perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}:p' taskset -c 0 sleep 1
d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 0 sleep 1
e. PEBS sampling case for user space registers
perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 0 sleep 1
f. PEBS sampling case with auxiliary (memory info) group
perf mem record taskset -c 0 sleep 1
g. PEBS sampling case with counter group
perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 0 sleep 1
h. Perf stat and record test
perf test 95; perf test 119
E-core:
a. Basic perf counting case.
perf stat -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1
b. Basic PMI based perf sampling case.
perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1
c. Basic PEBS based perf sampling case.
perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}:p' taskset -c 9 sleep 1
d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 9 sleep 1
e. PEBS sampling case for user space registers
perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 9 sleep 1
f. PEBS sampling case with auxiliary (memory info) group
perf mem record taskset -c 9 sleep 1
g. PEBS sampling case with counter group
perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 9 sleep 1
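The long event groups in the tests above all follow the same pattern: N
copies of the branches event followed by a few extra events, optionally
qualified by a hybrid PMU name (cpu_core or cpu_atom). As a convenience
for re-running them, the group strings could be generated rather than
typed out. The sketch below is only an illustration: repeat_event and
build_group are hypothetical helper names, not part of this series or of
perf.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the patch series or of perf) that
# rebuild the event-group strings used in the tests above.

# repeat_event EVENT COUNT
# Print EVENT repeated COUNT times, separated by commas.
repeat_event() {
    event=$1
    count=$2
    out=""
    i=0
    while [ "$i" -lt "$count" ]; do
        out="${out:+$out,}$event"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# build_group PMU COUNT [EXTRA_EVENT...]
# Print a '{...}' group holding COUNT branches events followed by the
# extra events. An empty PMU gives bare event names (non-hybrid case);
# otherwise every event is written as PMU/event/.
build_group() {
    pmu=$1
    count=$2
    shift 2
    if [ -n "$pmu" ]; then
        list=$(repeat_event "$pmu/branches/" "$count")
    else
        list=$(repeat_event "branches" "$count")
    fi
    for ev in "$@"; do
        if [ -n "$pmu" ]; then
            list="${list:+$list,}$pmu/$ev/"
        else
            list="${list:+$list,}$ev"
        fi
    done
    printf '{%s}\n' "$list"
}
```

With these helpers, the E-core counting case could be written as
perf stat -e "$(build_group cpu_atom 8 cycles instructions ref-cycles topdown-bad-spec topdown-fe-bound topdown-retiring)" taskset -c 9 sleep 1
and nmi_watchdog can be turned off beforehand (as root) with
sysctl -w kernel.nmi_watchdog=0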
History:
v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/
Ref:
[1]: https://lore.kernel.org/all/20250129154820.3755948-1-kan.liang@linux.intel.com/
Dapeng Mi (22):
perf/x86/intel: Add PMU support for Clearwater Forest
perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
perf/x86/intel: Decouple BTS initialization from PEBS initialization
perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs
perf/x86/intel: Introduce pairs of PEBS static calls
perf/x86/intel: Initialize architectural PEBS
perf/x86/intel/ds: Factor out common PEBS processing code to functions
perf/x86/intel: Process arch-PEBS records or record fragments
perf/x86/intel: Factor out common functions to process PEBS groups
perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
perf/x86/intel: Update dyn_constranit base on PEBS event precise level
perf/x86/intel: Setup PEBS data configuration and enable legacy groups
perf/x86/intel: Add SSP register support for arch-PEBS
perf/x86/intel: Add counter group support for arch-PEBS
perf/core: Support to capture higher width vector registers
perf/x86/intel: Support arch-PEBS vector registers group capturing
perf tools: Support to show SSP register
perf tools: Enhance arch__intr/user_reg_mask() helpers
perf tools: Enhance sample_regs_user/intr to capture more registers
perf tools: Support to capture more vector registers (x86/Intel)
perf tools/tests: Add vector registers PEBS sampling test
perf tools: Fix incorrect --user-regs comments
Kan Liang (2):
perf/x86: Add dynamic constraint
perf/x86/intel: Add Panther Lake support
arch/arm/kernel/perf_regs.c | 6 +
arch/arm64/kernel/perf_regs.c | 6 +
arch/csky/kernel/perf_regs.c | 5 +
arch/loongarch/kernel/perf_regs.c | 5 +
arch/mips/kernel/perf_regs.c | 5 +
arch/powerpc/perf/perf_regs.c | 5 +
arch/riscv/kernel/perf_regs.c | 5 +
arch/s390/kernel/perf_regs.c | 5 +
arch/x86/events/core.c | 105 ++-
arch/x86/events/intel/bts.c | 6 +-
arch/x86/events/intel/core.c | 330 +++++++-
arch/x86/events/intel/ds.c | 722 ++++++++++++++----
arch/x86/events/intel/lbr.c | 2 +-
arch/x86/events/perf_event.h | 69 +-
arch/x86/include/asm/intel_ds.h | 10 +-
arch/x86/include/asm/msr-index.h | 28 +
arch/x86/include/asm/perf_event.h | 145 +++-
arch/x86/include/uapi/asm/perf_regs.h | 87 ++-
arch/x86/kernel/perf_regs.c | 55 +-
include/linux/perf_event.h | 3 +
include/linux/perf_regs.h | 10 +
include/uapi/linux/perf_event.h | 11 +
kernel/events/core.c | 53 +-
tools/arch/x86/include/uapi/asm/perf_regs.h | 90 ++-
tools/include/uapi/linux/perf_event.h | 14 +
tools/perf/arch/arm/util/perf_regs.c | 8 +-
tools/perf/arch/arm64/util/perf_regs.c | 11 +-
tools/perf/arch/csky/util/perf_regs.c | 8 +-
tools/perf/arch/loongarch/util/perf_regs.c | 8 +-
tools/perf/arch/mips/util/perf_regs.c | 8 +-
tools/perf/arch/powerpc/util/perf_regs.c | 17 +-
tools/perf/arch/riscv/util/perf_regs.c | 8 +-
tools/perf/arch/s390/util/perf_regs.c | 8 +-
tools/perf/arch/x86/util/perf_regs.c | 112 ++-
tools/perf/builtin-record.c | 2 +-
tools/perf/builtin-script.c | 23 +-
tools/perf/tests/shell/record.sh | 55 ++
tools/perf/util/evsel.c | 36 +-
tools/perf/util/intel-pt.c | 2 +-
tools/perf/util/parse-regs-options.c | 23 +-
.../perf/util/perf-regs-arch/perf_regs_x86.c | 90 +++
tools/perf/util/perf_regs.c | 8 +-
tools/perf/util/perf_regs.h | 20 +-
tools/perf/util/record.h | 4 +-
tools/perf/util/sample.h | 6 +-
tools/perf/util/session.c | 29 +-
tools/perf/util/synthetic-events.c | 6 +-
47 files changed, 1966 insertions(+), 308 deletions(-)
--
2.40.1