Message-Id: <20250218152818.158614-1-dapeng1.mi@linux.intel.com>
Date: Tue, 18 Feb 2025 15:27:54 +0000
From: Dapeng Mi <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Kan Liang <kan.liang@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Eranian Stephane <eranian@...gle.com>
Cc: linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org,
	Dapeng Mi <dapeng1.mi@...el.com>,
	Dapeng Mi <dapeng1.mi@...ux.intel.com>
Subject: [Patch v2 00/24] Arch-PEBS and PMU supports for Clearwater Forest and Panther Lake

This v2 patch series is based on the latest perf/core tree, commit
"1623ced247f7 (x86/events/amd/iommu: Increase IOMMU_NAME_SIZE)", plus the
first two patches of the patch set "Cleanup for Intel PMU initialization"[1].

Changes:

  v1 -> v2:
    * Add Panther Lake PMU support (patch 02/24)
    * Add PEBS static calls to avoid introducing too many
      x86_pmu.arch_pebs checks (patches 07~08/24)
    * Optimize PEBS constraints based on Kan's dynamic constraint patch
      (patch 13/24)
    * Split the perf tools patch that supports more vector registers
      into several smaller patches (patches 20~22/24)

Tests:

  * The tests below were run on Clearwater Forest and no issues were
    found. Note that nmi_watchdog was disabled while running the tests.

  a. Basic perf counting case.
    perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}' sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles,topdown-bad-spec,topdown-fe-bound,topdown-retiring}:p' sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 sleep 1

  e. PEBS sampling case with auxiliary (memory info) group
    perf mem record sleep 1

  f. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1

  g. Perf stat and record test
    perf test 95; perf test 119

  h. perf-fuzzer test
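
  The tests above assume nmi_watchdog is disabled, but the commands used
  to disable it are not shown. A minimal sketch of how the state could be
  checked before a run, assuming the standard procfs file
  /proc/sys/kernel/nmi_watchdog; the helper name nmi_watchdog_state is
  hypothetical and not part of this series:

```shell
#!/bin/sh
# nmi_watchdog_state [FILE]
# Print "disabled", "enabled" or "unknown" based on the content of FILE.
# FILE defaults to /proc/sys/kernel/nmi_watchdog; it is a parameter only
# so that the helper can be exercised without touching the real system.
nmi_watchdog_state() {
    file=${1:-/proc/sys/kernel/nmi_watchdog}
    if [ ! -r "$file" ]; then
        printf 'unknown\n'
    elif [ "$(cat "$file")" = "0" ]; then
        printf 'disabled\n'
    else
        printf 'enabled\n'
    fi
}

# Possible usage before running the perf tests (needs root to change it):
#   [ "$(nmi_watchdog_state)" = disabled ] || sysctl -w kernel.nmi_watchdog=0
```

  The sysctl setting lasts until the next reboot unless it is made
  persistent by other means.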


  * Similar tests were run on Panther Lake P-cores and E-cores and no
    issues were found. On the test machine, CPU 0 is a P-core and CPU 9
    is an E-core. nmi_watchdog was disabled as well.

  P-core:

  a. Basic perf counting case.
    perf stat -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}' taskset -c 0 sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/branches/,cpu_core/cycles/,cpu_core/instructions/,cpu_core/ref-cycles/,cpu_core/slots/}:p' taskset -c 0 sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 0 sleep 1

  e. PEBS sampling case for user space registers
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 0 sleep 1

  f. PEBS sampling case with auxiliary (memory info) group
    perf mem record taskset -c 0 sleep 1

  g. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 0 sleep 1

  h. Perf stat and record test
    perf test 95; perf test 119
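
  CPU 0 and CPU 9 are specific to the test machine. On another hybrid
  system the CPU to pin to could be derived from the PMU's "cpus" sysfs
  attribute instead of being hard-coded. A minimal sketch, assuming the
  attribute holds a kernel cpulist such as "0-7" or "8,10-15"; the
  helper name first_cpu and the sysfs paths in the comments are
  assumptions, not something this series defines:

```shell
#!/bin/sh
# first_cpu CPULIST
# Print the first CPU number of a kernel cpulist string.
first_cpu() {
    first=${1%%,*}       # keep only the part before the first comma
    first=${first%%-*}   # keep only the lower bound of a range
    printf '%s\n' "$first"
}

# Possible usage on a hybrid system (paths assumed to exist):
#   pcpu=$(first_cpu "$(cat /sys/devices/cpu_core/cpus)")
#   ecpu=$(first_cpu "$(cat /sys/devices/cpu_atom/cpus)")
#   perf mem record taskset -c "$pcpu" sleep 1
```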

  E-core:

  a. Basic perf counting case.
    perf stat -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1

  b. Basic PMI based perf sampling case.
    perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}' taskset -c 9 sleep 1

  c. Basic PEBS based perf sampling case.
    perf record -e '{cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/branches/,cpu_atom/cycles/,cpu_atom/instructions/,cpu_atom/ref-cycles/,cpu_atom/topdown-bad-spec/,cpu_atom/topdown-fe-bound/,cpu_atom/topdown-retiring/}:p' taskset -c 9 sleep 1

  d. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0,ymmh0 -b -c 10000 taskset -c 9 sleep 1

  e. PEBS sampling case for user space registers
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 taskset -c 9 sleep 1

  f. PEBS sampling case with auxiliary (memory info) group
    perf mem record taskset -c 9 sleep 1

  g. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 taskset -c 9 sleep 1
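
  The long event groups used in cases a~c above repeat the same event
  many times and differ only in the PMU prefix and the trailing events.
  A minimal sketch of generating such a group string instead of typing
  it by hand; the helper names gen_group and fmt_event are hypothetical
  and only illustrate the pattern:

```shell
#!/bin/sh
# fmt_event PREFIX EVENT
# Print "PREFIX/EVENT/" for a hybrid PMU prefix, or plain "EVENT" when
# PREFIX is empty (non-hybrid systems).
fmt_event() {
    if [ -n "$1" ]; then
        printf '%s/%s/' "$1" "$2"
    else
        printf '%s' "$2"
    fi
}

# gen_group PREFIX COUNT EXTRA...
# Print a perf event group made of COUNT copies of "branches" followed
# by the EXTRA events, all formatted with PREFIX.
gen_group() {
    prefix=$1
    count=$2
    shift 2
    list=""
    i=0
    while [ "$i" -lt "$count" ]; do
        list="${list:+$list,}$(fmt_event "$prefix" branches)"
        i=$((i + 1))
    done
    for ev in "$@"; do
        list="${list:+$list,}$(fmt_event "$prefix" "$ev")"
    done
    printf '{%s}\n' "$list"
}

# Possible usage, intended to correspond to the E-core counting case:
#   perf stat -e "$(gen_group cpu_atom 8 cycles instructions ref-cycles \
#       topdown-bad-spec topdown-fe-bound topdown-retiring)" \
#       taskset -c 9 sleep 1
```

  A modifier such as ":p" can be appended to the generated string by the
  caller, as in the PEBS sampling cases.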

History:
  v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/

Ref:
  [1]: https://lore.kernel.org/all/20250129154820.3755948-1-kan.liang@linux.intel.com/


Dapeng Mi (22):
  perf/x86/intel: Add PMU support for Clearwater Forest
  perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
  perf/x86/intel: Decouple BTS initialization from PEBS initialization
  perf/x86/intel: Rename x86_pmu.pebs to x86_pmu.ds_pebs
  perf/x86/intel: Introduce pairs of PEBS static calls
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel/ds: Factor out common PEBS processing code to functions
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel: Factor out common functions to process PEBS groups
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Update dyn_constranit base on PEBS event precise level
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Add SSP register support for arch-PEBS
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/core: Support to capture higher width vector registers
  perf/x86/intel: Support arch-PEBS vector registers group capturing
  perf tools: Support to show SSP register
  perf tools: Enhance arch__intr/user_reg_mask() helpers
  perf tools: Enhance sample_regs_user/intr to capture more registers
  perf tools: Support to capture more vector registers (x86/Intel)
  perf tools/tests: Add vector registers PEBS sampling test
  perf tools: Fix incorrect --user-regs comments

Kan Liang (2):
  perf/x86: Add dynamic constraint
  perf/x86/intel: Add Panther Lake support

 arch/arm/kernel/perf_regs.c                   |   6 +
 arch/arm64/kernel/perf_regs.c                 |   6 +
 arch/csky/kernel/perf_regs.c                  |   5 +
 arch/loongarch/kernel/perf_regs.c             |   5 +
 arch/mips/kernel/perf_regs.c                  |   5 +
 arch/powerpc/perf/perf_regs.c                 |   5 +
 arch/riscv/kernel/perf_regs.c                 |   5 +
 arch/s390/kernel/perf_regs.c                  |   5 +
 arch/x86/events/core.c                        | 105 ++-
 arch/x86/events/intel/bts.c                   |   6 +-
 arch/x86/events/intel/core.c                  | 330 +++++++-
 arch/x86/events/intel/ds.c                    | 722 ++++++++++++++----
 arch/x86/events/intel/lbr.c                   |   2 +-
 arch/x86/events/perf_event.h                  |  69 +-
 arch/x86/include/asm/intel_ds.h               |  10 +-
 arch/x86/include/asm/msr-index.h              |  28 +
 arch/x86/include/asm/perf_event.h             | 145 +++-
 arch/x86/include/uapi/asm/perf_regs.h         |  87 ++-
 arch/x86/kernel/perf_regs.c                   |  55 +-
 include/linux/perf_event.h                    |   3 +
 include/linux/perf_regs.h                     |  10 +
 include/uapi/linux/perf_event.h               |  11 +
 kernel/events/core.c                          |  53 +-
 tools/arch/x86/include/uapi/asm/perf_regs.h   |  90 ++-
 tools/include/uapi/linux/perf_event.h         |  14 +
 tools/perf/arch/arm/util/perf_regs.c          |   8 +-
 tools/perf/arch/arm64/util/perf_regs.c        |  11 +-
 tools/perf/arch/csky/util/perf_regs.c         |   8 +-
 tools/perf/arch/loongarch/util/perf_regs.c    |   8 +-
 tools/perf/arch/mips/util/perf_regs.c         |   8 +-
 tools/perf/arch/powerpc/util/perf_regs.c      |  17 +-
 tools/perf/arch/riscv/util/perf_regs.c        |   8 +-
 tools/perf/arch/s390/util/perf_regs.c         |   8 +-
 tools/perf/arch/x86/util/perf_regs.c          | 112 ++-
 tools/perf/builtin-record.c                   |   2 +-
 tools/perf/builtin-script.c                   |  23 +-
 tools/perf/tests/shell/record.sh              |  55 ++
 tools/perf/util/evsel.c                       |  36 +-
 tools/perf/util/intel-pt.c                    |   2 +-
 tools/perf/util/parse-regs-options.c          |  23 +-
 .../perf/util/perf-regs-arch/perf_regs_x86.c  |  90 +++
 tools/perf/util/perf_regs.c                   |   8 +-
 tools/perf/util/perf_regs.h                   |  20 +-
 tools/perf/util/record.h                      |   4 +-
 tools/perf/util/sample.h                      |   6 +-
 tools/perf/util/session.c                     |  29 +-
 tools/perf/util/synthetic-events.c            |   6 +-
 47 files changed, 1966 insertions(+), 308 deletions(-)

-- 
2.40.1

