lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8fbf7fc5-2e38-4882-8835-49869b6dd47f@linux.intel.com>
Date: Tue, 17 Jun 2025 17:49:13 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>, kan.liang@...ux.intel.com
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
 tglx@...utronix.de, dave.hansen@...ux.intel.com, irogers@...gle.com,
 adrian.hunter@...el.com, jolsa@...nel.org,
 alexander.shishkin@...ux.intel.com, linux-kernel@...r.kernel.org,
 ak@...ux.intel.com, zide.chen@...el.com
Subject: Re: [RFC PATCH 06/12] perf: Support extension of sample_regs


On 6/17/2025 4:14 PM, Peter Zijlstra wrote:
> On Fri, Jun 13, 2025 at 06:49:37AM -0700, kan.liang@...ux.intel.com wrote:
>> From: Kan Liang <kan.liang@...ux.intel.com>
>>
>> More regs may be required in a sample, e.g., the vector registers. The
>> current sample_regs_XXX has run out of space.
>>
>> Add sample_ext_regs_intr/user[2] in the struct perf_event_attr. It's used
>> as a bitmap for the extension regs. There will be more than 64 registers
>> added.
>> Add a new flag PERF_PMU_CAP_EXTENDED_REGS2 to indicate the PMU which
>> supports sample_ext_regs_intr/user.
>>
>> Extend the perf_reg_validate() to support the validation of the
>> extension regs.
>>
>> Extend the perf_reg_value() to retrieve the extension regs. The regs may
>> be larger than u64. Add two parameters to store the pointer and size.
>> Add a dedicated perf_output_sample_ext_regs() to dump the extension
>> regs.
>>
>> This is just a generic support for the extension regs. Any attempts to
>> manipulate the extension regs will error out, until the driver-specific
>> supports are implemented, which will be done in the following patch.
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 78a362b80027..e22ba72efcdb 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -382,6 +382,10 @@ enum perf_event_read_format {
>>  #define PERF_ATTR_SIZE_VER6			120	/* Add: aux_sample_size */
>>  #define PERF_ATTR_SIZE_VER7			128	/* Add: sig_data */
>>  #define PERF_ATTR_SIZE_VER8			136	/* Add: config3 */
>> +#define PERF_ATTR_SIZE_VER9			168	/* Add: sample_ext_regs_intr */
>> +							/* Add: sample_ext_regs_user */
>> +
>> +#define PERF_ATTR_EXT_REGS_SIZE			2
>>  
>>  /*
>>   * 'struct perf_event_attr' contains various attributes that define
>> @@ -543,6 +547,10 @@ struct perf_event_attr {
>>  	__u64	sig_data;
>>  
>>  	__u64	config3; /* extension of config2 */
>> +
>> +	/* extension of sample_regs_XXX */
>> +	__u64	sample_ext_regs_intr[PERF_ATTR_EXT_REGS_SIZE];
>> +	__u64	sample_ext_regs_user[PERF_ATTR_EXT_REGS_SIZE];
>>  };
> Did anybody read this email?
>
>   https://lkml.kernel.org/r/20250416155327.GD17910@noisy.programming.kicks-ass.net
>
> The current regs interface really was designed for regular registers,
> and trying to squish SIMD registers into it is a trainwreck.
>
> Not to mention that AAAARGHH64 and Risc-V have vector widths up to 2048
> bit.

Yes, we followed this discussion. In sample_ext_regs_intr/user[], each bit
represents an extended register regardless of how much bits does the
register have.  At the beginning we added a item "sample_simd_reg_words" to
represent the bit-width of the corresponding extended register, but we
found it's unnecessary since the regs bitmap is fixed for a specific arch
and the arch-specific code would know how many bits for the certain regs,
e.g., on x86 platform, the bit 0 would represent YMMH0 in this patchset ,
then the x86 specific perf code would know its bit-wdith is 128bits.

The reason that we define an array with 2 u64 words is that we plan to
support YMM (16 bits) + APX (16 bits) + OPMASK (8 bits) + ZMM (32 bits) +
SSP (1 bit) regs which needs 73 bits and one u64 word is not enough.


>
>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ