Message-ID: <aFF6gdxVyp36ADOi@J2N7QTR9R3>
Date: Tue, 17 Jun 2025 15:24:01 +0100
From: Mark Rutland <mark.rutland@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>, kan.liang@...ux.intel.com,
mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
tglx@...utronix.de, dave.hansen@...ux.intel.com, irogers@...gle.com,
adrian.hunter@...el.com, jolsa@...nel.org,
alexander.shishkin@...ux.intel.com, linux-kernel@...r.kernel.org,
ak@...ux.intel.com, zide.chen@...el.com
Subject: Re: [RFC PATCH 06/12] perf: Support extension of sample_regs

On Tue, Jun 17, 2025 at 04:06:17PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 17, 2025 at 03:33:33PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 17, 2025 at 08:14:36PM +0800, Mi, Dapeng wrote:
> >
> > > > We're going to do a sane SIMD register set with variable width, and
> > > > reclaim the XMM regs from the normal set.
> > >
> > > Ok, so we need to add two width variables like
> > > sample_ext_regs_words_intr/user,
> >
> > s/ext/simd/
> >
> > Not sure it makes sense to have separate vector widths for kernel and
> > user regs, but sure.
> >
> > > then reuse the XMM regs bitmap to represent the extend regs bitmap.
> >
> > But it's not extended; it's the normal bitmap.
> >
> > > Considering the OPMASK regs and APX
> > > extended GPRs have same bit-width (64 bits), we may have to combine them
> > > into a single bitmap, e.g. bits[15:0] represents R31~R16 and bits[23:16]
> > > represents OPMASK7 ~ OPMASK0.
> >
> > Again confused, bits 0:23 are the normal registers (in a lunatic
> > order). The XMM regs are in 32:63 and will be free if the SIMD thing is
> > present.
> >
> > SPP+APX should definitely go there.
> >
> > Not sure about OPMASK; those really do belong with the SIMD state. Let
> > me go figure out what ARM and Risc-V look like in more detail.
>
> So ARM-SVE has 32 vector registers with 16 predicate registers.
>
> Risc-V Zv seems to only have 32 vector registers; no special purpose
> predicate registers, instead a regular vector register can be used as a
> predicate register.
>
> PowerPC VSX has 64 vector registers and no predicate registers afaict.
>
> While reading this, I came across the useful note that predicate
> registers are 1/8-th the length of the vector registers (because the
> minimal element is a byte). So while the current AVX-512 predicate
> registers are indeed 64bits, this would no longer be true for the
> hypothetical AVX-1024 (or even AVX-512 if we allow 4bit elements).
>
> As such, I don't think we should stick the predicate registers in the
> normal group -- they really are not normal registers and won't fit for
> future extensions.
>
> This then leaves us two options:
>
> - stick the predicate registers in the high bits of the vector register
> word, or
>
> - add an explicit predicate register word.
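
As a worked example of the 1/8 ratio quoted above (a sketch only;
pred_bits() is a made-up helper, not something in the series), assuming
one predicate bit per minimal element:

```c
#include <assert.h>

/*
 * Hypothetical helper: width in bits of a predicate register, assuming
 * one predicate bit per minimal vector element.
 */
static unsigned int pred_bits(unsigned int vec_bits, unsigned int min_elem_bits)
{
	assert(min_elem_bits != 0);
	return vec_bits / min_elem_bits;
}

/*
 * pred_bits(512, 8)  == 64   current AVX-512, byte elements
 * pred_bits(1024, 8) == 128  hypothetical 1024-bit vectors
 * pred_bits(512, 4)  == 128  512-bit vectors with 4-bit elements
 */
```

So a fixed 64-bit slot only fits the first case, which is the concern
with putting the predicate registers in the normal group.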

TBH, I don't think we can handle extended state in a generic way unless
we treat this like a ptrace regset, and delegate the format of each
specific register set to the architecture code.

On arm64, the behaviour is modal (with two different vector lengths for
streaming/non-streaming SVE when SME is implemented), per-task
configurable (with different vector lengths), can differ between
host/guest for KVM, and some of the registers only exist in some
configurations (e.g. the FFR only exists for SME if FA64 is
implemented).
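
For illustration, a regset-style layout could look roughly like the
below. This is a hypothetical sketch, not an existing perf ABI: struct
sample_regset_hdr, its fields, and regset_find() are made-up names, and
the payload format is deliberately left to the arch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-regset header; the payload follows immediately. */
struct sample_regset_hdr {
	uint16_t type;	/* arch-defined regset id */
	uint16_t flags;	/* arch-defined, e.g. which optional regs exist */
	uint32_t size;	/* payload size in bytes, may vary per sample */
};

/*
 * Walk back-to-back records in buf and return a pointer to the payload
 * of the first record of the given type, or NULL if there is none or a
 * record is truncated. On success *size is set to the payload size.
 */
static const void *regset_find(const void *buf, size_t len,
			       uint16_t type, uint32_t *size)
{
	const unsigned char *p = buf;

	while (len >= sizeof(struct sample_regset_hdr)) {
		struct sample_regset_hdr hdr;

		/* memcpy avoids assuming the buffer is aligned */
		memcpy(&hdr, p, sizeof(hdr));
		p += sizeof(hdr);
		len -= sizeof(hdr);

		if (hdr.size > len)
			return NULL;	/* truncated record */
		if (hdr.type == type) {
			*size = hdr.size;
			return p;
		}

		p += hdr.size;
		len -= hdr.size;
	}

	return NULL;
}
```

The point of carrying the size per record is that a consumer can skip
regsets it doesn't understand, and the arch code can describe things
like the current vector length or mode in the payload itself.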

Mark.