Message-Id: <20251203065500.2597594-1-dapeng1.mi@linux.intel.com>
Date: Wed,  3 Dec 2025 14:54:41 +0800
From: Dapeng Mi <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Jiri Olsa <jolsa@...nel.org>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Eranian Stephane <eranian@...gle.com>
Cc: Mark Rutland <mark.rutland@....com>,
	broonie@...nel.org,
	Ravi Bangoria <ravi.bangoria@....com>,
	linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org,
	Zide Chen <zide.chen@...el.com>,
	Falcon Thomas <thomas.falcon@...el.com>,
	Dapeng Mi <dapeng1.mi@...el.com>,
	Xudong Hao <xudong.hao@...el.com>,
	Dapeng Mi <dapeng1.mi@...ux.intel.com>
Subject: [Patch v5 00/19] Support SIMD/eGPRs/SSP registers sampling for perf

Changes since V4:
- Rewrite some function comments and commit messages (Dave)
- Add arch-PEBS based SIMD/eGPRs/SSP sampling support (Patch 15/19)
- Fix the "suspicious NMI" warning observed on PTL/NVL P-core and DMR by
  activating the back-to-back NMI detection mechanism (Patch 16/19)
- Fix some minor issues on perf-tool patches (Patch 18/19)

Changes since V3:
- Drop the SIMD registers if an NMI hits kernel mode for REGS_USER.
- Only dump the available regs, rather than zero and dump the
  unavailable regs. It's possible that the dumped registers are a subset
  of the requested registers.
- Some minor updates to address Dapeng's comments in V3.

Changes since V2:
- Use the FPU format for the x86_pmu.ext_regs_mask as well
- Add a check before invoking xsaves_nmi()
- Add perf_simd_reg_check() to retrieve the number of available
  registers. If the kernel fails to get the requested registers, e.g.,
  XSAVES fails, nothing is dumped to user space (V2 dumped all 0s).
- Add POC perf tool patches

Changes since V1:
- Apply the new interfaces to configure and dump the SIMD registers
- Utilize the existing FPU functions, e.g., xstate_calculate_size,
  get_xsave_addr().

Starting from Intel Ice Lake, XMM registers can be collected in a PEBS
record. Future Architecture PEBS will include additional registers such
as YMM, ZMM, OPMASK, SSP and APX eGPRs, contingent on hardware support.

This patch set introduces a software solution that relaxes the hardware
requirement by using the XSAVES instruction to retrieve the requested
registers in the overflow handler. With it, the feature is no longer
limited to PEBS events or specific platforms. While the hardware solution
remains preferable due to its lower overhead and higher accuracy, this
software approach provides a viable alternative.

The solution is theoretically compatible with all x86 platforms but is
currently enabled on newer platforms, including Sapphire Rapids and
later P-core server platforms, Sierra Forest and later E-core server
platforms and recent Client platforms, like Arrow Lake, Panther Lake and
Nova Lake.

Newly supported registers include YMM, ZMM, OPMASK, SSP, and APX eGPRs.
Due to space constraints in sample_regs_user/intr, new fields have been 
introduced in the perf_event_attr structure to accommodate these
registers.

After a long discussion in V1,
https://lore.kernel.org/lkml/3f1c9a9e-cb63-47ff-a5e9-06555fa6cc9a@linux.intel.com/
the following new fields are introduced.

@@ -543,6 +545,25 @@ struct perf_event_attr {
        __u64   sig_data;

        __u64   config3; /* extension of config2 */
+
+
+       /*
+        * Defines set of SIMD registers to dump on samples.
+        * The sample_simd_regs_enabled !=0 implies the
+        * set of SIMD registers is used to config all SIMD registers.
+        * If !sample_simd_regs_enabled, sample_regs_XXX may be used to
+        * config some SIMD registers on X86.
+        */
+       union {
+               __u16 sample_simd_regs_enabled;
+               __u16 sample_simd_pred_reg_qwords;
+       };
+       __u32 sample_simd_pred_reg_intr;
+       __u32 sample_simd_pred_reg_user;
+       __u16 sample_simd_vec_reg_qwords;
+       __u64 sample_simd_vec_reg_intr;
+       __u64 sample_simd_vec_reg_user;
+       __u32 __reserved_4;
 };
@@ -1016,7 +1037,15 @@ enum perf_event_type {
         *      } && PERF_SAMPLE_BRANCH_STACK
         *
         *      { u64                   abi; # enum perf_sample_regs_abi
-        *        u64                   regs[weight(mask)]; } && PERF_SAMPLE_REGS_USER
+        *        u64                   regs[weight(mask)];
+        *        struct {
+        *              u16 nr_vectors;
+        *              u16 vector_qwords;
+        *              u16 nr_pred;
+        *              u16 pred_qwords;
+        *              u64 data[nr_vectors * vector_qwords + nr_pred * pred_qwords];
+        *        } && (abi & PERF_SAMPLE_REGS_ABI_SIMD)
+        *      } && PERF_SAMPLE_REGS_USER
         *
         *      { u64                   size;
         *        char                  data[size];
@@ -1043,7 +1072,15 @@ enum perf_event_type {
         *      { u64                   data_src; } && PERF_SAMPLE_DATA_SRC
         *      { u64                   transaction; } && PERF_SAMPLE_TRANSACTION
         *      { u64                   abi; # enum perf_sample_regs_abi
-        *        u64                   regs[weight(mask)]; } && PERF_SAMPLE_REGS_INTR
+        *        u64                   regs[weight(mask)];
+        *        struct {
+        *              u16 nr_vectors;
+        *              u16 vector_qwords;
+        *              u16 nr_pred;
+        *              u16 pred_qwords;
+        *              u64 data[nr_vectors * vector_qwords + nr_pred * pred_qwords];
+        *        } && (abi & PERF_SAMPLE_REGS_ABI_SIMD)
+        *      } && PERF_SAMPLE_REGS_INTR
         *      { u64                   phys_addr;} && PERF_SAMPLE_PHYS_ADDR
         *      { u64                   cgroup;} && PERF_SAMPLE_CGROUP
         *      { u64                   data_page_size;} && PERF_SAMPLE_DATA_PAGE_SIZE


To maintain simplicity, a single field per register class,
sample_simd_{vec|pred}_reg_qwords, is introduced to indicate the register
width in 64-bit qwords. For example:
- sample_simd_vec_reg_qwords = 2 for XMM registers (128 bits) on x86
- sample_simd_vec_reg_qwords = 4 for YMM registers (256 bits) on x86
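The width encoding above is simply the register size divided by 64 bits.
A minimal sketch, assuming a hypothetical helper named
simd_width_to_qwords() that is not part of this series:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): convert a register
 * width in bits to the number of 64-bit qwords used by the
 * sample_simd_*_reg_qwords fields.
 */
static inline uint16_t simd_width_to_qwords(unsigned int width_bits)
{
	/* Register widths are expected to be whole multiples of 64 bits. */
	assert(width_bits % 64 == 0);
	return (uint16_t)(width_bits / 64);
}
```

With this helper, 128-bit XMM maps to 2 and 256-bit YMM maps to 4, as in
the list above; by the same arithmetic a 512-bit ZMM register would be 8.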

Four additional fields, sample_simd_{vec|pred}_reg_{intr|user}, hold the
bitmaps of the registers to sample. For instance, the bitmap for the x86
XMM registers is 0xffff (16 XMM registers). Although users can
theoretically sample a subset of registers, the current perf-tool
implementation only supports sampling all registers of each type, to
avoid complexity.
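Since the perf tool always requests every register of a type, the bitmap
is just the low N bits set. A minimal sketch, assuming a hypothetical
helper named simd_all_regs_mask() that does not appear in the patches:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch set): build a bitmap that
 * selects registers 0..nr_regs-1 of one register type.
 */
static inline uint64_t simd_all_regs_mask(unsigned int nr_regs)
{
	assert(nr_regs <= 64);
	/* A shift by 64 is undefined in C, so handle the full-width case. */
	return nr_regs == 64 ? UINT64_MAX : ((UINT64_C(1) << nr_regs) - 1);
}
```

For 16 XMM registers this yields 0xffff, matching the value quoted above.
Note that the predicate bitmaps are declared as __u32 in the new
perf_event_attr fields, so a caller would narrow the result for those.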

A new ABI, PERF_SAMPLE_REGS_ABI_SIMD, is introduced to signal to user-space
tools that SIMD registers are present in sampling records. When this
flag is detected, tools should recognize that extra SIMD register data
follows the general register data. The layout of the extra SIMD register
data is as follows.

   u16 nr_vectors;
   u16 vector_qwords;
   u16 nr_pred;
   u16 pred_qwords;
   u64 data[nr_vectors * vector_qwords + nr_pred * pred_qwords];
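A consumer could walk this layout roughly as sketched below. The struct
and function names (simd_regs_hdr, simd_regs_data_qwords,
simd_regs_parse) are hypothetical and local to this sketch; they are not
the perf-tool implementation. The sketch assumes the buffer points at
the first byte after the general register data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the header described above; not a kernel UAPI type. */
struct simd_regs_hdr {
	uint16_t nr_vectors;
	uint16_t vector_qwords;
	uint16_t nr_pred;
	uint16_t pred_qwords;
};

/* Number of u64 entries in data[] that follow the header. */
static inline size_t simd_regs_data_qwords(const struct simd_regs_hdr *h)
{
	return (size_t)h->nr_vectors * h->vector_qwords +
	       (size_t)h->nr_pred * h->pred_qwords;
}

/*
 * Hypothetical parser: copy the header out of a sample buffer and
 * return a pointer to data[], or NULL if the buffer is too short to
 * hold the header plus all register qwords it announces.
 */
static inline const uint8_t *simd_regs_parse(const uint8_t *buf, size_t len,
					     struct simd_regs_hdr *h)
{
	assert(buf && h);
	if (len < sizeof(*h))
		return NULL;
	memcpy(h, buf, sizeof(*h));	/* avoids unaligned access */
	if ((len - sizeof(*h)) / sizeof(uint64_t) < simd_regs_data_qwords(h))
		return NULL;
	return buf + sizeof(*h);
}
```

For a record with nr_vectors 32, vector_qwords 8, nr_pred 8 and
pred_qwords 1, data[] holds 32 * 8 + 8 * 1 = 264 qwords, i.e. 2112 bytes
after the 8-byte header.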

With this patch set, sampling for the aforementioned registers is
supported on the Intel Nova Lake platform.

Examples:
 $perf record -I?
 available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28
 R29 R30 R31 SSP XMM0-15 YMM0-15 ZMM0-31 OPMASK0-7

 $perf record --user-regs=?
 available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
 R11 R12 R13 R14 R15 R16 R17 R18 R19 R20 R21 R22 R23 R24 R25 R26 R27 R28
 R29 R30 R31 SSP XMM0-15 YMM0-15 ZMM0-31 OPMASK0-7

 $perf record -e branches:p -Iax,bx,r8,r16,r31,ssp,xmm,ymm,zmm,opmask -c 100000 ./test
 $perf report -D

 ... ...
 14027761992115 0xcf30 [0x8a8]: PERF_RECORD_SAMPLE(IP, 0x1): 29964/29964:
 0xffffffff9f085e24 period: 100000 addr: 0
 ... intr regs: mask 0x18001010003 ABI 64-bit
 .... AX    0xdffffc0000000000
 .... BX    0xffff8882297685e8
 .... R8    0x0000000000000000
 .... R16   0x0000000000000000
 .... R31   0x0000000000000000
 .... SSP   0x0000000000000000
 ... SIMD ABI nr_vectors 32 vector_qwords 8 nr_pred 8 pred_qwords 1
 .... ZMM  [0] 0xffffffffffffffff
 .... ZMM  [0] 0x0000000000000001
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [0] 0x0000000000000000
 .... ZMM  [1] 0x003a6b6165506d56
 ... ...
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... ZMM  [31] 0x0000000000000000
 .... OPMASK[0] 0x00000000fffffe00
 .... OPMASK[1] 0x0000000000ffffff
 .... OPMASK[2] 0x000000000000007f
 .... OPMASK[3] 0x0000000000000000
 .... OPMASK[4] 0x0000000000010080
 .... OPMASK[5] 0x0000000000000000
 .... OPMASK[6] 0x0000400004000000
 .... OPMASK[7] 0x0000000000000000
 ... ...
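The regs[weight(mask)] notation in the record layout means one u64 per
set bit of the register mask. A minimal sketch of that count, using a
hypothetical helper named regs_mask_weight() rather than any kernel or
perf-tool function:

```c
#include <stdint.h>

/* Portable Hamming weight: number of set bits in a register mask. */
static inline unsigned int regs_mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Applied to the mask 0x18001010003 from the dump above, this gives 6,
which matches the six general registers listed (AX, BX, R8, R16, R31
and SSP).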


History:
  v4: https://lore.kernel.org/all/20250925061213.178796-1-dapeng1.mi@linux.intel.com/
  v3: https://lore.kernel.org/lkml/20250815213435.1702022-1-kan.liang@linux.intel.com/
  v2: https://lore.kernel.org/lkml/20250626195610.405379-1-kan.liang@linux.intel.com/
  v1: https://lore.kernel.org/lkml/20250613134943.3186517-1-kan.liang@linux.intel.com/

Dapeng Mi (3):
  perf: Eliminate duplicate arch-specific functions definations
  perf/x86/intel: Enable arch-PEBS based SIMD/eGPRs/SSP sampling
  perf/x86: Activate back-to-back NMI detection for arch-PEBS induced
    NMIs

Kan Liang (16):
  perf/x86: Use x86_perf_regs in the x86 nmi handler
  perf/x86: Introduce x86-specific x86_pmu_setup_regs_data()
  x86/fpu/xstate: Add xsaves_nmi() helper
  perf: Move and rename has_extended_regs() for ARCH-specific use
  perf/x86: Add support for XMM registers in non-PEBS and REGS_USER
  perf: Add sampling support for SIMD registers
  perf/x86: Enable XMM sampling using sample_simd_vec_reg_* fields
  perf/x86: Enable YMM sampling using sample_simd_vec_reg_* fields
  perf/x86: Enable ZMM sampling using sample_simd_vec_reg_* fields
  perf/x86: Enable OPMASK sampling using sample_simd_pred_reg_* fields
  perf/x86: Enable eGPRs sampling using sample_regs_* fields
  perf/x86: Enable SSP sampling using sample_regs_* fields
  perf/x86/intel: Enable PERF_PMU_CAP_SIMD_REGS capability
  perf headers: Sync with the kernel headers
  perf parse-regs: Support new SIMD sampling format
  perf regs: Enable dumping of SIMD registers

 arch/arm/kernel/perf_regs.c                   |   8 +-
 arch/arm64/kernel/perf_regs.c                 |   8 +-
 arch/csky/kernel/perf_regs.c                  |   8 +-
 arch/loongarch/kernel/perf_regs.c             |   8 +-
 arch/mips/kernel/perf_regs.c                  |   8 +-
 arch/parisc/kernel/perf_regs.c                |   8 +-
 arch/powerpc/perf/perf_regs.c                 |   2 +-
 arch/riscv/kernel/perf_regs.c                 |   8 +-
 arch/s390/kernel/perf_regs.c                  |   2 +-
 arch/x86/events/core.c                        | 326 +++++++++++-
 arch/x86/events/intel/core.c                  | 117 ++++-
 arch/x86/events/intel/ds.c                    | 134 ++++-
 arch/x86/events/perf_event.h                  |  85 +++-
 arch/x86/include/asm/fpu/xstate.h             |   3 +
 arch/x86/include/asm/msr-index.h              |   7 +
 arch/x86/include/asm/perf_event.h             |  38 +-
 arch/x86/include/uapi/asm/perf_regs.h         |  62 +++
 arch/x86/kernel/fpu/xstate.c                  |  25 +-
 arch/x86/kernel/perf_regs.c                   | 131 ++++-
 include/linux/perf_event.h                    |  16 +
 include/linux/perf_regs.h                     |  36 +-
 include/uapi/linux/perf_event.h               |  45 +-
 kernel/events/core.c                          | 132 ++++-
 tools/arch/x86/include/uapi/asm/perf_regs.h   |  62 +++
 tools/include/uapi/linux/perf_event.h         |  45 +-
 tools/perf/arch/x86/util/perf_regs.c          | 470 +++++++++++++++++-
 tools/perf/util/evsel.c                       |  47 ++
 tools/perf/util/parse-regs-options.c          | 151 +++++-
 .../perf/util/perf-regs-arch/perf_regs_x86.c  |  43 ++
 tools/perf/util/perf_event_attr_fprintf.c     |   6 +
 tools/perf/util/perf_regs.c                   |  59 +++
 tools/perf/util/perf_regs.h                   |  11 +
 tools/perf/util/record.h                      |   6 +
 tools/perf/util/sample.h                      |  10 +
 tools/perf/util/session.c                     |  78 ++-
 35 files changed, 2012 insertions(+), 193 deletions(-)


base-commit: 9929dffce5ed7e2988e0274f4db98035508b16d9
prerequisite-patch-id: a15bcd62a8dcd219d17489eef88b66ea5488a2a0
-- 
2.34.1

