Message-ID: <aJpUWnYEL18dk4aC@x1>
Date: Mon, 11 Aug 2025 13:36:42 -0700
From: Drew Fustini <fustini@...nel.org>
To: Florian Weimer <fweimer@...hat.com>
Cc: Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Alexandre Ghiti <alex@...ti.fr>,
Samuel Holland <samuel.holland@...ive.com>,
Björn Töpel <bjorn@...osinc.com>,
Andy Chiu <andybnac@...il.com>,
Conor Dooley <conor.dooley@...rochip.com>,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
Drew Fustini <dfustini@...storrent.com>
Subject: Re: [PATCH v2] riscv: Add sysctl to control discard of vstate during syscall

On Sun, Aug 10, 2025 at 09:45:45AM +0200, Florian Weimer wrote:
> * Drew Fustini:
>
> > On Sat, Aug 09, 2025 at 10:40:46AM +0200, Florian Weimer wrote:
> >> * Drew Fustini:
> >>
> >> > From: Drew Fustini <dfustini@...storrent.com>
> >> >
> >> > Clobbering the vector registers can significantly increase system call
> >> > latency for some implementations. To mitigate this performance impact, a
> >> > sysctl knob is provided that controls whether the vector state is
> >> > discarded in the syscall path:
> >> >
> >> > /proc/sys/abi/riscv_v_vstate_discard
> >> >
> >> > Valid values are:
> >> >
> >> > 0: Vector state is not always clobbered in all syscalls
> >> > 1: Mandatory clobbering of vector state in all syscalls
> >> >
> >> > The initial state is controlled by CONFIG_RISCV_ISA_V_VSTATE_DISCARD.
> >>
> >> Can this be put into the system call number instead, or make it specific
> >> to some system calls in other ways?
> >
> > Do you mean control of the initial state of the sysctl, or not having a
> > sysctl for discard behavior at all?
>
> It seems rather strange to have a sysctl for such an ABI change
> because it really has to be a per-process property.

The reason for the sysctl is that I want a means to let a system opt out
of clobbering vector state on the syscall entry path. This is because the
clobbering adds significant overhead on some implementations. For example,
it results in a 25% longer syscall duration on the X280 core.

I would be in favor of reverting the mandatory clobbering behavior, but
Palmer says that it is useful for test suites. Since a revert isn't an
option, I want a system-wide policy control like this sysctl. It does
seem like there could be some advantages to per-process control, but I
think that delves into ABI changes, which I feel is a separate issue from
a system-wide knob for "always clobber"/"do not always clobber".

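As an illustration of how userspace might inspect such a system-wide knob,
here is a minimal C sketch that reads the proposed sysctl file and maps its
contents to the two documented values. The path is taken from the patch
description; the helper names parse_vstate_discard() and
read_vstate_discard() are hypothetical, and the file exists only on kernels
that carry this patch, so absence is treated as "unknown".

```c
#include <stdio.h>

/* Path from the patch description; present only on kernels with this patch. */
#define VSTATE_DISCARD_PATH "/proc/sys/abi/riscv_v_vstate_discard"

/*
 * Hypothetical helper: interpret the sysctl text.
 * Returns 0 (not always clobbered), 1 (mandatory clobbering),
 * or -1 if the text is missing or not one of the two valid values.
 */
static int parse_vstate_discard(const char *text)
{
    int value;

    if (text == NULL || sscanf(text, "%d", &value) != 1)
        return -1;
    return (value == 0 || value == 1) ? value : -1;
}

/*
 * Hypothetical helper: read the knob from the given path.
 * Returns -1 if the file is absent, unreadable, or malformed.
 */
static int read_vstate_discard(const char *path)
{
    char buf[16];
    char *line;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    line = fgets(buf, sizeof buf, f);   /* NULL on an empty file or error */
    fclose(f);
    return parse_vstate_discard(line);
}
```

Calling read_vstate_discard(VSTATE_DISCARD_PATH) would then yield 1 when
clobbering is mandatory, 0 when it is not, and -1 on a kernel without the
knob.
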
>
> >> I think C libraries can use this optimization for their system calls
> >> (after adjusting the assembler clobbers) because the vector state is
> >> caller-saved in the standard calling convention. But there is backwards
> >> compatibility impact for turning this on for the entire process.
> >
> > The focus I have right now is allowing users to avoid the delay in
> > syscall entry on implementations where clobbering is slow. Palmer had
> > mentioned in my v1 [1] that he has 'a patch out for GCC that enables a
> > system-wide vector ABI, but I don't have time to test/benchmark it so
> > it's kind of hard to justify'. It seems like creating a new ABI where
> > the vector registers are preserved across syscalls could be useful, but
> > I think it would be best to handle that possibility later on.
>
> I'm confused. Current glibc assumes that vector registers are preserved
> across system calls because the assembler clobbers do not mention them.
> Similar inline assembly probably has ended up in other projects, too.
> It works by accident if glibc is compiled for a non-vector target, or if
> it so happens that GCC never keeps vector registers alive across system
> calls.

I wasn't trying to make any ABI changes with this sysctl patch. The
RISC-V kernel documentation states that vector state is not preserved
across syscalls. I am not trying to change that policy.

Around the same time that Palmer added that statement to the vector
documentation, Björn added the code that always clobbers the vector
registers on syscall entry. This was done to ensure that programs
were not relying on vector state being preserved.

At the time, two years ago, Palmer and Björn talked about how this could
be revisited if the clobbering turned out to be slow on real hardware.
This patch is my attempt to allow platforms with slow vstate clobbering
to opt out of this strict mandatory clobbering on syscall entry.

Thanks,
Drew