Message-ID: <d50b0407-c006-48c0-98dc-37d428d5aacf@linux.intel.com>
Date: Fri, 27 Jun 2025 17:23:15 -0400
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Dave Hansen <dave.hansen@...el.com>, peterz@...radead.org,
mingo@...hat.com, acme@...nel.org, namhyung@...nel.org, tglx@...utronix.de,
dave.hansen@...ux.intel.com, irogers@...gle.com, adrian.hunter@...el.com,
jolsa@...nel.org, alexander.shishkin@...ux.intel.com,
linux-kernel@...r.kernel.org
Cc: dapeng1.mi@...ux.intel.com, ak@...ux.intel.com, zide.chen@...el.com,
mark.rutland@....com, broonie@...nel.org, ravi.bangoria@....com
Subject: Re: [RFC PATCH V2 05/13] perf/x86: Support XMM register for non-PEBS and REGS_USER

On 2025-06-27 10:35 a.m., Dave Hansen wrote:
> On 6/26/25 12:56, kan.liang@...ux.intel.com wrote:
>> +static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
>> +{
>> + struct xregs_state *xsave = per_cpu(ext_regs_buf, smp_processor_id());
>> +
>> + if (WARN_ON_ONCE(!xsave))
>> + return;
>> +
>> + xsaves_nmi(xsave, mask);
>
> This makes me a little nervous.
>
> Could we maybe keep a mask around that reminds us what 'ext_regs_buf'
> was sized for and then ensure that no bits in the passed-in mask are set
> in that?
>
The x86_pmu.ext_regs_mask tracks the available bits of
x86_pmu.ext_regs_buf. But it has its own format.
I will make it use the XSAVE format, and add a check here.
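A minimal sketch of the check described above, assuming the stored mask has already been converted to the XSAVE format. The struct `sketch_pmu` and the helper `sketch_ext_regs_mask_ok()` are hypothetical stand-ins written for illustration; only the XSAVE bit positions (x87 = bit 0, SSE = bit 1, AVX/YMM = bit 2) are taken from the architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSAVE state-component bits, same positions as the kernel's XFEATURE_MASK_* */
#define XFEATURE_MASK_FP   (1ULL << 0)
#define XFEATURE_MASK_SSE  (1ULL << 1)
#define XFEATURE_MASK_YMM  (1ULL << 2)

/* Hypothetical stand-in for the PMU state that remembers what the buffer was sized for */
struct sketch_pmu {
	uint64_t ext_regs_mask;	/* assumed to be kept in XSAVE format */
};

/*
 * Return true only if every bit requested in 'mask' was accounted for
 * when the buffer was sized. A caller would skip the save otherwise.
 */
static inline bool sketch_ext_regs_mask_ok(const struct sketch_pmu *pmu, uint64_t mask)
{
	return (mask & ~pmu->ext_regs_mask) == 0;
}
```

In the real function this test would presumably sit in front of xsaves_nmi(), next to the existing WARN_ON_ONCE(), so that a request for an unsized component never reaches the save instruction.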
> I almost wonder if you want to add a
>
> struct fpu_state_config fpu_perf_cfg;
>
> I guess it's mostly overkill for this. But please do have a look at the
> data structures in:
>
> arch/x86/include/asm/fpu/types.h
>
It looks like overkill. The perf usage is simple. It should be good enough
to have one mask to track the available bits. The size comes from the FPU's
xstate_calculate_size(). I think, as long as perf inputs the correct
mask, the size can be trusted.
>> + if (mask & XFEATURE_MASK_SSE &&
>> + xsave->header.xfeatures & BIT_ULL(XFEATURE_SSE))
>> + perf_regs->xmm_space = xsave->i387.xmm_space;
>> +}
>
> There's a lot going on here.
>
> 'mask' and 'xfeatures' have the exact same format. Why use
> XFEATURE_MASK_SSE for one and BIT_ULL(XFEATURE_SSE) for the other?
>
Ah, my bad. The same XFEATURE_MASK_SSE should be used.
> Why check both? How could a bit get into 'xfeatures' without being in
> 'mask'?
The 'mask' is what perf wants/configures. I think the 'xfeatures' is
what XSAVE really gives. I'm not quite sure if HW can always give us
everything we configured. If not, I think both checks are required.
I'm thinking of adding the line below first:
    valid_mask = x86_pmu.ext_regs_mask & mask & xsave->header.xfeatures;
Then only valid_mask is used to check each XFEATURE.
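A rough user-space sketch of the valid_mask idea, under simplifying assumptions. The types `sketch_xsave` and `sketch_perf_regs` and the function `sketch_get_ext_regs()` are hypothetical and much smaller than the kernel's xregs_state and x86_perf_regs. Resetting the pointer to NULL when SSE is unavailable is a choice made in this sketch, not something stated in the thread.

```c
#include <stddef.h>
#include <stdint.h>

#define XFEATURE_MASK_SSE  (1ULL << 1)

/* Hypothetical, simplified stand-in for the XSAVE buffer */
struct sketch_xsave {
	uint64_t xfeatures;		/* components the save actually wrote */
	uint32_t xmm_space[64];		/* 16 XMM registers, 16 bytes each */
};

/* Hypothetical, simplified stand-in for the perf register snapshot */
struct sketch_perf_regs {
	const uint32_t *xmm_space;
};

/*
 * Intersect what the buffer supports, what perf asked for and what the
 * save reported, then test each feature against that single mask.
 */
static inline uint64_t sketch_get_ext_regs(struct sketch_perf_regs *regs,
					    const struct sketch_xsave *xsave,
					    uint64_t supported, uint64_t mask)
{
	uint64_t valid_mask = supported & mask & xsave->xfeatures;

	if (valid_mask & XFEATURE_MASK_SSE)
		regs->xmm_space = xsave->xmm_space;
	else
		regs->xmm_space = NULL;	/* sketch-only choice: never leave it stale */

	return valid_mask;
}
```

With one combined mask, every later feature test uses the same constant form (XFEATURE_MASK_*), which also avoids mixing it with BIT_ULL(XFEATURE_*) as in the quoted hunk.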
>
> How does the caller handle the fact that ->xmm_space might be written or
> not?
>
For this series, the returned XMM value is zeroed if ->xmm_space is
NULL.
But I should also clear nr_vectors, so that nothing is dumped to
userspace when ->xmm_space is not available. I will address it in V3.
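One possible shape for that V3 change, as a sketch only. The struct `sketch_sample_regs` and both helpers are hypothetical; where nr_vectors actually lives in the series is not shown in this message. The only fixed fact used is that an XMM register is 16 bytes wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-sample register state */
struct sketch_sample_regs {
	const uint64_t *xmm_space;	/* NULL when the XMM state was not captured */
	unsigned int nr_vectors;	/* number of vector registers to dump */
};

/* Drop the vector count when there is no data to back it. */
static inline void sketch_finalize_vectors(struct sketch_sample_regs *regs)
{
	if (!regs->xmm_space)
		regs->nr_vectors = 0;
}

/* Bytes that would be copied out for the vector registers (16 bytes per XMM). */
static inline size_t sketch_vector_dump_bytes(const struct sketch_sample_regs *regs)
{
	return (size_t)regs->nr_vectors * 16;
}
```

Clearing the count, instead of emitting zeroed registers, lets the consumer tell "not captured" apart from "captured and all zero".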
Thanks,
Kan