lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dc084dac-170d-434e-9d8c-ba11cbc8e008@linux.intel.com>
Date: Tue, 17 Jun 2025 20:14:36 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: kan.liang@...ux.intel.com, mingo@...hat.com, acme@...nel.org,
 namhyung@...nel.org, tglx@...utronix.de, dave.hansen@...ux.intel.com,
 irogers@...gle.com, adrian.hunter@...el.com, jolsa@...nel.org,
 alexander.shishkin@...ux.intel.com, linux-kernel@...r.kernel.org,
 ak@...ux.intel.com, zide.chen@...el.com
Subject: Re: [RFC PATCH 06/12] perf: Support extension of sample_regs


On 6/17/2025 6:28 PM, Peter Zijlstra wrote:
> On Tue, Jun 17, 2025 at 05:49:13PM +0800, Mi, Dapeng wrote:
>> On 6/17/2025 4:14 PM, Peter Zijlstra wrote:
>>> On Fri, Jun 13, 2025 at 06:49:37AM -0700, kan.liang@...ux.intel.com wrote:
>>>> From: Kan Liang <kan.liang@...ux.intel.com>
>>>>
>>>> More regs may be required in a sample, e.g., the vector registers. The
>>>> current sample_regs_XXX has run out of space.
>>>>
>>>> Add sample_ext_regs_intr/user[2] in the struct perf_event_attr. It's used
>>>> as a bitmap for the extension regs. There will be more than 64 registers
>>>> added.
>>>> Add a new flag PERF_PMU_CAP_EXTENDED_REGS2 to indicate the PMU which
>>>> supports sample_ext_regs_intr/user.
>>>>
>>>> Extend the perf_reg_validate() to support the validation of the
>>>> extension regs.
>>>>
>>>> Extend the perf_reg_value() to retrieve the extension regs. The regs may
>>>> be larger than u64. Add two parameters to store the pointer and size.
>>>> Add a dedicated perf_output_sample_ext_regs() to dump the extension
>>>> regs.
>>>>
>>>> This is just a generic support for the extension regs. Any attempts to
>>>> manipulate the extension regs will error out, until the driver-specific
>>>> supports are implemented, which will be done in the following patch.
>>>>
>>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>>>> index 78a362b80027..e22ba72efcdb 100644
>>>> --- a/include/uapi/linux/perf_event.h
>>>> +++ b/include/uapi/linux/perf_event.h
>>>> @@ -382,6 +382,10 @@ enum perf_event_read_format {
>>>>  #define PERF_ATTR_SIZE_VER6			120	/* Add: aux_sample_size */
>>>>  #define PERF_ATTR_SIZE_VER7			128	/* Add: sig_data */
>>>>  #define PERF_ATTR_SIZE_VER8			136	/* Add: config3 */
>>>> +#define PERF_ATTR_SIZE_VER9			168	/* Add: sample_ext_regs_intr */
>>>> +							/* Add: sample_ext_regs_user */
>>>> +
>>>> +#define PERF_ATTR_EXT_REGS_SIZE			2
>>>>  
>>>>  /*
>>>>   * 'struct perf_event_attr' contains various attributes that define
>>>> @@ -543,6 +547,10 @@ struct perf_event_attr {
>>>>  	__u64	sig_data;
>>>>  
>>>>  	__u64	config3; /* extension of config2 */
>>>> +
>>>> +	/* extension of sample_regs_XXX */
>>>> +	__u64	sample_ext_regs_intr[PERF_ATTR_EXT_REGS_SIZE];
>>>> +	__u64	sample_ext_regs_user[PERF_ATTR_EXT_REGS_SIZE];
>>>>  };
>>> Did anybody read this email?
>>>
>>>   https://lkml.kernel.org/r/20250416155327.GD17910@noisy.programming.kicks-ass.net
>>>
>>> The current regs interface really was designed for regular registers,
>>> and trying to squish SIMD registers into it is a trainwreck.
>>>
>>> Not to mention that AAAARGHH64 and Risc-V have vector widths up to 2048
>>> bit.
>> Yes, we followed this discussion. In sample_ext_regs_intr/user[], each bit
>> represents an extended register regardless of how much bits does the
>> register have.  At the beginning we added a item "sample_simd_reg_words" to
>> represent the bit-width of the corresponding extended register, but we
>> found it's unnecessary since the regs bitmap is fixed for a specific arch
> So I disagree. Fundamentally we have only 32 SIMD registers on x86. We
> should not have more bits than that.
>
>> and the arch-specific code would know how many bits for the certain regs,
>> e.g., on x86 platform, the bit 0 would represent YMMH0 in this patchset ,
>> then the x86 specific perf code would know its bit-wdith is 128bits.
> This is nonsense; YMMH0 is not a register. It should never be in this
> array.
>
>> The reason that we define an array with 2 u64 words is that we plan to
>> support YMM (16 bits) + APX (16 bits) + OPMASK (8 bits) + ZMM (32 bits) +
>> SSP (1 bit) regs which needs 73 bits and one u64 word is not enough.
> This is insane. So now you're having 16 XMM 'regs', 16 YMMH 'regs' and
> 32 ZMMH 'regs' for a total of 64 bits that should have been just 32.
>
> Suppose we're going to be doing AVX-1024, because awesome. That means we
> need another 32 bits to denote whatever letter comes after 'Z'.
>
> So no, this idiocy stops now.
>
> We're going to do a sane SIMD register set with variable width, and
> reclaim the XMM regs from the normal set.

Ok, so we need to add two width variables like
sample_ext_regs_words_intr/user, then reuse the XMM regs bitmap to
represent the extend regs bitmap. Considering the OPMASK regs and APX
extended GPRs have same bit-width (64 bits), we may have to combine them
into a single bitmap, e.g. bits[15:0] represents R31~R16 and bits[23:16]
represents OPMASK7 ~ OPMASK0. 

Peter, is this what you expected? 



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ