[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <DBQ8LC4H0HVO.2DOI8X0CKOGM0@ventanamicro.com>
Date: Thu, 31 Jul 2025 14:24:52 +0200
From: Radim Krčmář <rkrcmar@...tanamicro.com>
To: "Palmer Dabbelt" <palmer@...belt.com>
Cc: <fustini@...nel.org>, "Bjorn Topel" <bjorn@...osinc.com>, "Alexandre
Ghiti" <alex@...ti.fr>, "Paul Walmsley" <paul.walmsley@...ive.com>,
<samuel.holland@...ive.com>, <dfustini@...storrent.com>,
<andybnac@...il.com>, "Conor Dooley" <conor.dooley@...rochip.com>,
<linux-riscv@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<linux-riscv-bounces@...ts.infradead.org>
Subject: Re: [PATCH] riscv: Add sysctl to control discard of vstate during
syscall
2025-07-30T18:05:59-07:00, Palmer Dabbelt <palmer@...belt.com>:
> On Mon, 21 Jul 2025 07:54:25 PDT (-0700), rkrcmar@...tanamicro.com wrote:
>> 2025-07-21T14:35:38+02:00, Radim Krčmář <rkrcmar@...tanamicro.com>:
>>> Shouldn't the RISC-V Linux syscall ABI be defined somewhere?
>>
>> To clarify this point. My issue is with the following part in
>> Documentation/arch/riscv/vector.rst:
>>
>>>> As indicated by version 1.0 of the V extension [1], vector registers are
>>>> clobbered by system calls.
>>>> [...]
>>>> 1: https://github.com/riscv/riscv-v-spec/blob/master/calling-convention.adoc
>>
>> The ISA does not say that vector registers are clobbered by system
>> calls. All the ISA says is:
>>
>> "This Appendix is only a placeholder to help explain the conventions
>> used in the code examples, and is not considered frozen or
>> part of the ratification process. The official RISC-V psABI document
>> is being expanded to specify the vector calling conventions."
>
> It also says
>
> Executing a system call causes all caller-saved vector registers
> (v0-v31, vl, vtype) and vstart to become unspecied.
>
> in the ISA manual, a few sentences later in that page.
It also says
Most OSes will choose to either leave these registers intact or reset
them to their initial state to avoid leaking information across
process boundaries.
Both options make sense, but we're not doing either.
> So that's what
> we were trying to get at with the documentation pointer, but maybe it's
> better to have something more explicit like
>
> diff --git a/Documentation/arch/riscv/vector.rst b/Documentation/arch/riscv/vector.rst
> index 3987f5f76a9d..e8591660a7bb 100644
> --- a/Documentation/arch/riscv/vector.rst
> +++ b/Documentation/arch/riscv/vector.rst
> @@ -134,7 +134,10 @@ processes in form of sysctl knob:
> 3. Vector Register State Across System Calls
> ---------------------------------------------
>
> -As indicated by version 1.0 of the V extension [1], vector registers are
> -clobbered by system calls.
> +Linux adopts the syscall ABI proposed by version 1.0 of the V extension [1],
The whole section is just a non-normative convention for its own code
examples, so I wouldn't say the V extension proposed it for anyone.
We can just say what Linux does without referencing anything, because
nothing tells Linux what to do.
> +where vector registers are clobbered by system calls. Specifically
> +
> + Executing a system call causes all caller-saved vector registers
> + (v0-v31, vl, vtype) and vstart to become unspecied.
We still need to define which registers are caller-saved.
No vector registers are preserved in the current syscall ABI, so I'd
just omit "caller-saved", to define that all vector registers become
unspecified.
> 1: https://github.com/riscv/riscv-v-spec/blob/master/calling-convention.adoc
>
>> while the RISC-V psABI says:
>>
>> "The calling convention for system calls does not fall within the
>> scope of this document. Please refer to the documentation of the
>> RISC-V execution environment interface (e.g OS kernel ABI, SBI)."
>>
>> We made a circular dependency, misinterpreted the ISA, and probably
>> implemented a suboptimal syscall ABI -- preserving vector registers
>> seems strictly better.
>
> We'd really need userspace to have an ABI that preserves vector
> registers for it to be useful in the kernel.
I don't really like over-catering to the standard psABI -- it's a relic
of simpler times, and I hope programs will eventually be freed of it.
> As it stands there's
> pretty much nothing that's going to have useful vector state over a
> syscall, as they're almost always hidden behind some C function and
> those clobber the vector state.
Userspace doesn't even have to carry vector state over syscall -- just
using vector, doing syscall, using other vector, doing syscall is
hindered by the current design, because each syscall has to touch vector
registers for not good reason.
> I have a patch out for GCC that enables
> a system-wide vector ABI, but I don't have time to test/benchmark it so
> it's kind of hard to justify.
You mean enabling the alternative psABI for vectors?
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/712449f8efcf6b3acd9e2a2a7ddfe89486317877/riscv-cc.adoc#calling-convention-variant
> That said:
>
> My first guess here would be that trashing the V register
> state is still faster on the machines that triggered this patch, it's
> just that the way we're trashing it is slow. We're doing some wacky
> things in there (VILL, LMUL, clearing to -1), so it's not surprising
> that some implementations are slow on these routines.
I am afraid we might end up with a commmandline, DT, or
mvendorid+marchid+mimpid hint to pick the best method at runtime, or
boot time benchmarking in case it's not even known.
> This came up during the original patch and we decided to just go with
> this way (which is recommended by the ISA) until someone could
> demonstrate it's slow, so sounds like it's time to go revisit those.
>
> So I'd start with something like
>
> diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
> index b61786d43c20..1fba33e62d2b 100644
> --- a/arch/riscv/include/asm/vector.h
> +++ b/arch/riscv/include/asm/vector.h
> @@ -287,7 +287,6 @@ static inline void __riscv_v_vstate_discard(void)
> "vmv.v.i v8, -1\n\t"
> "vmv.v.i v16, -1\n\t"
> "vmv.v.i v24, -1\n\t"
> - "vsetvl %0, x0, %1\n\t"
> ".option pop\n\t"
> : "=&r" (vl) : "r" (vtype_inval));
>
> to try and see if we're tripping over bad implementation behavior, in
> which case we can just hide this all in the kernel. Then we can split
> out these performance issues from other things like lazy save/restore
> and a V-preserving uABI, as it stands this is all sort of getting mixed
> up.
Yeah, the discussion got a bit out of hand.
I don't see much point in doing minor changes to the current design, as
it isn't anywhere near the Pareto front.
If we want to touch vectors on syscalls, I think it makes sense to start
with Vivian's proposal -- eagerly initializing vectors in syscalls
provides at least some advantage when eventually doing a context switch.
(If the performance is still bad, then we can initialize lazily on vector
restore, which should be optimal for everything except programs that
want to preserve vectors across syscalls.)
>>> How come we could have broken it with 9657e9b7d253?
>>
>> We changed the ABI once, so maybe we can change it back?
>
> We didn't change the ABI, the documentation always said "vector registers are
> clobbered by system calls".
My bad, I didn't see the patch went in with the initial version, thanks.
Powered by blists - more mailing lists