[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
Date: Fri, 20 Jun 2025 10:38:56 +0000
From: Dapeng Mi <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>,
Eranian Stephane <eranian@...gle.com>
Cc: linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
Dapeng Mi <dapeng1.mi@...el.com>,
Dapeng Mi <dapeng1.mi@...ux.intel.com>
Subject: [Patch v4 00/13] arch-PEBS enabling for Intel platforms
This patchset introduces architectural PEBS support for Intel platforms
like Clearwater Forest (CWF) and Panther Lake (PTL). The detailed
information about arch-PEBS can be found in chapter 11
"architectural PEBS" of "Intel Architecture Instruction Set Extensions
and Future Features".
Comparing with v3 patchset, the most significant change is to remove the
sampling support for new SIMD regs (OPMASK/YMM/ZMM). Considering the
complication of supporting SIMD regs sampling, the SIMD regs sampling
support is extracted as an independent patchset[1] and this patchset only
focus on the arch-PEBS enabling itself. Once the basic SIMD regs sampling
is supported, the arch-PEBS based SIMD regs (OPMASK/YMM/ZMM) sampling
would be added on top of the basic SIMD regs sampling.
Changes:
v3 -> v4:
* Rebase code to 6.16-rc2
* Extract the new SIMD regs sampling to an independent patchset
* Fix the PEBS buffer allocation issue (Peter)
* Fix the arch-PEBS dynamic constraints issue (Kan)
Tests:
Run below tests on Clearwater Forest and Pantherlake, no issue is
found.
1. Basic perf counting case.
perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1
2. Basic PMI based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1
3. Basic PEBS based perf sampling case.
perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}:p' sleep 1
4. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
perf record -e branches:p -Iax,bx,ip,ssp,xmm0 -b -c 10000 sleep 1
5. User space PEBS sampling case with basic GPRs and LBR groups
perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1
6 PEBS sampling case with auxiliary (memory info) group
perf mem record sleep 1
7. PEBS sampling case with counter group
perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1
8. Perf stat and record test
perf test 96; perf test 125
History:
v3: https://lore.kernel.org/all/20250415114428.341182-1-dapeng1.mi@linux.intel.com/
v2: https://lore.kernel.org/all/20250218152818.158614-1-dapeng1.mi@linux.intel.com/
v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/
Ref:
[1]: https://lore.kernel.org/all/20250613134943.3186517-1-kan.liang@linux.intel.com/
Dapeng Mi (13):
perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
perf/x86/intel: Correct large PEBS flag check
perf/x86/intel: Initialize architectural PEBS
perf/x86/intel/ds: Factor out PEBS record processing code to functions
perf/x86/intel/ds: Factor out PEBS group processing code to functions
perf/x86/intel: Process arch-PEBS records or record fragments
perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
perf/x86/intel: Update dyn_constranit base on PEBS event precise level
perf/x86/intel: Setup PEBS data configuration and enable legacy groups
perf/x86/intel: Add counter group support for arch-PEBS
perf/x86: Support to sample SSP register
perf/x86/intel: Support to sample SSP register for arch-PEBS
perf tools: x86: Support to show SSP register
arch/x86/events/core.c | 37 +-
arch/x86/events/intel/core.c | 256 +++++++-
arch/x86/events/intel/ds.c | 595 ++++++++++++++----
arch/x86/events/perf_event.h | 46 +-
arch/x86/include/asm/intel_ds.h | 10 +-
arch/x86/include/asm/msr-index.h | 20 +
arch/x86/include/asm/perf_event.h | 117 +++-
arch/x86/include/uapi/asm/perf_regs.h | 4 +-
arch/x86/kernel/perf_regs.c | 7 +
tools/arch/x86/include/uapi/asm/perf_regs.h | 7 +-
tools/perf/arch/x86/util/perf_regs.c | 2 +
tools/perf/util/intel-pt.c | 2 +-
.../perf/util/perf-regs-arch/perf_regs_x86.c | 2 +
13 files changed, 959 insertions(+), 146 deletions(-)
base-commit: e04c78d86a9699d136910cfc0bdcf01087e3267e
--
2.43.0
Powered by blists - more mailing lists