[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240622-matcher-greeting-9574fd23c340@spud>
Date: Sat, 22 Jun 2024 12:42:46 +0100
From: Conor Dooley <conor@...nel.org>
To: Jesse Taube <jesse@...osinc.com>
Cc: Charlie Jenkins <charlie@...osinc.com>,
Conor Dooley <conor.dooley@...rochip.com>,
linux-riscv@...ts.infradead.org, Jonathan Corbet <corbet@....net>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>, Rob Herring <robh@...nel.org>,
Krzysztof Kozlowski <krzk+dt@...nel.org>,
Clément Léger <cleger@...osinc.com>,
Evan Green <evan@...osinc.com>,
Andrew Jones <ajones@...tanamicro.com>,
Xiao Wang <xiao.w.wang@...el.com>, Andy Chiu <andy.chiu@...ive.com>,
Eric Biggers <ebiggers@...gle.com>,
Greentime Hu <greentime.hu@...ive.com>,
Björn Töpel <bjorn@...osinc.com>,
Heiko Stuebner <heiko@...ech.de>,
Costa Shulyupin <costa.shul@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Baoquan He <bhe@...hat.com>, Anup Patel <apatel@...tanamicro.com>,
Zong Li <zong.li@...ive.com>,
Sami Tolvanen <samitolvanen@...gle.com>,
Ben Dooks <ben.dooks@...ethink.co.uk>,
Alexandre Ghiti <alexghiti@...osinc.com>,
"Gustavo A. R. Silva" <gustavoars@...nel.org>,
Erick Archer <erick.archer@....com>,
Joel Granados <j.granados@...sung.com>, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, devicetree@...r.kernel.org
Subject: Re: [PATCH v2 4/6] RISC-V: Detect unaligned vector accesses
supported.
On Fri, Jun 21, 2024 at 02:07:58PM -0400, Jesse Taube wrote:
>
>
> On 6/21/24 13:18, Charlie Jenkins wrote:
> > On Fri, Jun 21, 2024 at 11:06:31AM +0100, Conor Dooley wrote:
> > > On Thu, Jun 20, 2024 at 03:14:14PM -0700, Charlie Jenkins wrote:
> > > > On Thu, Jun 20, 2024 at 05:31:28PM -0400, Jesse Taube wrote:
> > > > >
> > > > >
> > > > > On 6/17/24 22:09, Charlie Jenkins wrote:
> > > > > > On Mon, Jun 17, 2024 at 06:43:32PM -0700, Charlie Jenkins wrote:
> > > > > > > On Thu, Jun 13, 2024 at 03:16:13PM -0400, Jesse Taube wrote:
> > > > > > > > Run a unaligned vector access to test if the system supports
> > > > > > > > vector unaligned access. Add the result to a new key in hwprobe.
> > > > > > > > This is useful for usermode to know if vector misaligned accesses are
> > > > > > > > supported and if they are faster or slower than equivalent byte accesses.
> > > > > > > >
> > > > > > > > Signed-off-by: Jesse Taube <jesse@...osinc.com>
> > > > > > > > ---
> > > > > > > > V1 -> V2:
> > > > > > > > - Add Kconfig options
> > > > > > > > - Add insn_is_vector
> > > > > > > > - Add handle_vector_misaligned_load
> > > > > > > > - Fix build
> > > > > > > > - Seperate vector from scalar misaligned access
> > > > > > > > - This patch was almost completely rewritten
> > > > > > > > ---
> > > > > > > > arch/riscv/Kconfig | 41 +++++++
> > > > > > > > arch/riscv/include/asm/cpufeature.h | 7 +-
> > > > > > > > arch/riscv/include/asm/entry-common.h | 11 --
> > > > > > > > arch/riscv/include/asm/hwprobe.h | 2 +-
> > > > > > > > arch/riscv/include/asm/vector.h | 1 +
> > > > > > > > arch/riscv/include/uapi/asm/hwprobe.h | 5 +
> > > > > > > > arch/riscv/kernel/Makefile | 4 +-
> > > > > > > > arch/riscv/kernel/sys_hwprobe.c | 41 +++++++
> > > > > > > > arch/riscv/kernel/traps_misaligned.c | 119 ++++++++++++++++++++-
> > > > > > > > arch/riscv/kernel/unaligned_access_speed.c | 9 +-
> > > > > > > > arch/riscv/kernel/vector.c | 2 +-
> > > > > > > > 11 files changed, 221 insertions(+), 21 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > > > > > > > index b94176e25be1..f12df0ca6c18 100644
> > > > > > > > --- a/arch/riscv/Kconfig
> > > > > > > > +++ b/arch/riscv/Kconfig
> > > > > > > > @@ -723,6 +723,12 @@ config RISCV_MISALIGNED
> > > > > > > > help
> > > > > > > > Embed support for emulating misaligned loads and stores.
> > > > > > > > +config RISCV_VECTOR_MISALIGNED
> > > > > > > > + bool
> > > > > > > > + depends on RISCV_ISA_V
> > > > > > > > + help
> > > > > > > > + Enable detecting support for vector misaligned loads and stores.
> > > > > > > > +
> > > > > > > > choice
> > > > > > > > prompt "Unaligned Accesses Support"
> > > > > > > > default RISCV_PROBE_UNALIGNED_ACCESS
> > > > > > > > @@ -774,6 +780,41 @@ config RISCV_EFFICIENT_UNALIGNED_ACCESS
> > > > > > > > endchoice
> > > > > > > > +choice
> > > > > > > > + prompt "Vector unaligned Accesses Support"
> > > > > > > > + depends on RISCV_ISA_V
> > > > > > > > + default RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
> > > > > > > > + help
> > > > > > > > + This determines the level of support for vector unaligned accesses. This
> > > > > > > > + information is used by the kernel to perform optimizations.
> > >
> > > I haven't actually checked the patchset, but is it actually used by the
> > > kernel to perform optimisations?
> >
> > No ;)
> >
> > Right now this patch is just providing the information to userspace
> > through hwprobe and doesn't optimize anything in the kernel.
> >
> > >
> > > > > > > > It is also
> > > > > > > > + exposed to user space via the hwprobe syscall. The hardware will be
> > > > > > > > + probed at boot by default.
> > > > > > > > +
> > > > > > > > +config RISCV_DETECT_VECTOR_UNALIGNED_ACCESS
> > > > > > >
> > > > > > > This is not used anywhere, what is the reason for including it?
> > > > >
> > > > > This is so that we can check if they are supported or not, but not check the
> > > > > speed of them. Similar to RISCV_EMULATED_UNALIGNED_ACCESS.
> > > >
> > > > What do you mean? It isn't used anywhere so this "check if they are
> > > > supported or not" is not guarded by this config.
> > > >
> > > > >
> > > > > > >
> > > > > > > > + bool "Detect support for vector unaligned accesses"
> > > > > > > > + select RISCV_VECTOR_MISALIGNED
> > > > > > > > + help
> > > > > > > > + During boot, the kernel will detect if the system supports vector
> > > > > > > > + unaligned accesses.
> > > > > > > > +
> > > > > > > > +config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
> > > > > > > > + bool "Probe speed of vector unaligned accesses"
> > > > > > > > + select RISCV_VECTOR_MISALIGNED
> > > > > > > > + help
> > > > > > > > + During boot, the kernel will run a series of tests to determine the
> > > > > > > > + speed of vector unaligned accesses if they are supported. This probing
> > > > > > > > + will dynamically determine the speed of vector unaligned accesses on
> > > > > > > > + the underlying system if they are supported.
> > > > > > > > +
> > > > > > > > +config CONFIG_RISCV_UNALIGNED_ACCESS_UNSUPPORTED
> > > > > > >
> > > > > > > This should not be prefixed with CONFIG and does not include VECTOR in
> > > > > > > the name.
> > > > >
> > > > > Huh thought it would warn fixed though
> > > >
> > > > What do you mean by "warn fixed"?
> > > >
> > > > >
> > > > > > I assume you meant to put
> > > > > > > "RISCV_VEC_UNALIGNED_ACCESS_UNSUPPORTED" here?
> > > > >
> > > > > This is to leave a faster path like SLOW or FAST to say that unaligned
> > > > > access arent suported.
> > > >
> > > > I am not sure what you are responding to. This comment seems to be
> > > > responding to my correction of
> > > > CONFIG_RISCV_UNALIGNED_ACCESS_UNSUPPORTED->RISCV_VEC_UNALIGNED_ACCESS_UNSUPPORTED
> > > > so I don't see how that ties into SLOW/FAST.
> > > >
> > > > >
> > > > > > >
> > > > > > > This was also intentionally left out on the scalar side [1]. The
> > > > > > > implication here is that having this config will cause people to compile
> > > > > > > kernels without unaligned access support which really shouldn't be
> > > > > > > something we are explicitly supporting.
> > > > > > >
> > > > > > > If somebody does want to support hardware that does not handle vector
> > > > > > > unaligned accesses, the solution should be to add emulation support to
> > > > > > > the kernel.
> > > > >
> > > > > Yes but we dont have emulation support yet so I do think its a good idea.
> > > >
> > > > I am hesitant because it is very likely that somebody will add support
> > > > for unaligned vector emulation. When there is emulation support, this
> > > > config option should not exist to be consistent with scalar. However if
> > > > we add this option in now, we must expect a user to enable this config,
> > > > and then
> > >
> > > I dunno, I think there could be value in having the option here. For
> > > scalar, we couldn't have an option that would break the uABI, so the
> > > unsupported option wasn't okay. That's not a constraint that we have for
> > > vector.
> > >
> > > For vector, if you have a system that doesn't support misaligned access,
> > > you probably don't want to emulate the accesses either, since that's
> > > likely remove any performance gains you get from using vector in the
> > > first place, so I can see benefit in the option.
> >
> > We have the RISCV_MISALIGNED option that disables the scalar emulation,
> > but doesn't break the UABI because it's not saying that misaligned
> > scalar is not supported, but just that the kernel doesn't emulate.
> > Having an UNSUPPORTED option explicitly breaks the UABI which doesn't
> > seem like something that the kernel should support. If we are okay with
> > having options that break the UABI then this is fine, but I was under
> > the impression that we did our best to avoid that.
> >
> > There definitely is value in having an option like this for hardware
> > that never wants to run misaligned code, but since we decided with the
> > scalar accesses that we should not break the UABI I think vector should
> > do the same.
>
> The rational for scalar accesses was slightly different as
> The base ISA spec said/says: ". The base ISA supports misaligned accesses,
> but these might run extremely slowly depending on the implementation."
>
> Section: 2.6 Load and Store Instructions. Available:
> https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf
Right, the rationale for the Scalar stuff is that the uABI requires them,
cos when the port was merged the ISA required them to be supported.
The same constraint doesn't exist for vector, so there is no uABI
concern.
For scalar, the options must end up with a system that supports
misaligned access - hence anything without in-kernel emulation is marked
NONPORTABLE, since we cannot be sure that all systems will have
emulation in their firmware and/or support in the hardware.
For vector, emulation might be nice to have (or not given extra code and
complexity for shit performance that makes using vector not worthwhile,
but that's another argument), but it is not required for uABI reasons.
None of the options in this series make any different to the uABI
anyway, no matter what you choose, if the hardware doesn't support
misaligned accesses, you're gonna have problems.
> > > Enabling the probing is going to end up with same outcome for userspace
> > > as having this option on such a system, so it comes down to whether you
> > > want to allow people to skip the probing if they know their system has
> > > this problem.
> > >
> > > > we will have to get rid of it later. Users are not always happy
> > > > when config options are removed.
> > >
> > > I dunno, I don't think that adding emulation requires that we remove
> > > this unsupported option.
> >
> > I am probably being too paranoid about this but I am hesistant to cause
> > vector and scalar misaligned access implemenations to diverge. It is
> > inconsistent to allow an unsupported option for vector but not for
> > scalar when both support emulation.
> > The probing is very fast as it just
> > checks if a misaligned access causes a trap and then sets the access
> > speed to unsupported if it does trap.
>
> Charlie is right about probing for support being fast. There is
> RISCV_DETECT_VECTOR_UNALIGNED_ACCESS to only detect support not the speed.
> I
> thought it might be a good idea to add this config option, but
> I'll just drop it to shorten this thread.
I actually agree with Charlie that the option really isn't needed, I was
just arguing against the justifications he gave. It being fairly fast
and not really much code to just check for a trap I think is enough to
not have much cause for the option.
I really think that we should be added checks for how well it
performs though. Unless it is supported by the hardware (which we cannot
know just by checking for traps, since the firmware could be emulating),
the performance is unlikely to make it perform better than using the same
instructions but aligned or sticking to using scalars.
IMO the options should be (provided we don't have emulation):
- RISCV_VECTOR_PROBE_UNALIGNED_ACCESS: Check for a trap and then bail
early if not supported. If supported, check the speed.
- RISCV_VECTOR_SLOW_UNALIGNED_ACCESS: Don't check for traps or the
speed, report slow. NONPORTABLE.
- RISCV_VECTOR_EFFICIENT_UNALIGNED_ACCESS: Don't check for traps or the
speed, report fast. NONPORTABLE.
Having two different options for "detection" and "probing" is just
confusing and RISCV_DETECT_VECTOR_UNALIGNED_ACCESS' two possible results
both effectively mean the same thing to me if I was writing a userspace
program that wanted to make use of vector optimisations.
> > > Additionally, what are we doing in the kernel if we detect that
> > > misaligned stuff isn't supported? Are we going to mandate that kernel
> > > code is aligned only
>
> As of now yes.
Seems like we already have in-kernel vector code that assumes misaligned
accesses are supported :)
> > > disable in-kernel vector or some other mechanism
> > > to make sure that things like crypto code don't have/introduce code
> > > that'll not run on these systems?
>
> I'm not too familiar with the uses of unaligned vector accesses, but if
> it significantly improves performance then I think there should be a faster
> unaligned access pathway, and a aligned access pathway, so both are
> supported. This also ties into your first question.
I dunno, I think we'll have enough bits of optional code for different
extensions, I don't want to have different codepaths for these kinds of
systems. As I said to Eric, I think the vector crypto stuff should just
get disabled once we detect that misaligned accesses are unsupported.
He seems unwilling to support systems that didn't implement misaligned
accesses in the crypto code anyway.
I'm tempted to say that we should force enable Zicclsm if we detect both
support for vector and scalar crypto, but it happens too late to fix
that up I think.
> > UNSUPPORTED will still be set by the quick probe so it would be possible
> > for the kernel/userspace to avoid running misaligned vector when it's
> > unsupported. Any kernel methods would probably want to always run
> > aligned vector unless misaligned support was determined to be FAST
> > anyway, I am doubtful that code will have different optimizations for
> > FAST, SLOW, and UNSUPPORTED but it is possible.
> >
> > I would prefer consistency between scalar and vector misaligned support,
> > but this is not a deal breaker for this patch. I am not convinced it is
> > the best choice, but I am okay with leaving this option in the kernel.
I would like consistency too :)
Thanks,
Conor.
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists