[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250719033912.1313955-1-fustini@kernel.org>
Date: Fri, 18 Jul 2025 20:39:13 -0700
From: Drew Fustini <fustini@...nel.org>
To: Palmer Dabbelt <palmer@...belt.com>,
Björn Töpel <bjorn@...osinc.com>,
Alexandre Ghiti <alex@...ti.fr>,
Paul Walmsley <paul.walmsley@...ive.com>,
Samuel Holland <samuel.holland@...ive.com>,
Drew Fustini <dfustini@...storrent.com>,
Andy Chiu <andybnac@...il.com>,
Conor Dooley <conor.dooley@...rochip.com>,
linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org
Cc: Drew Fustini <fustini@...nel.org>
Subject: [PATCH] riscv: Add sysctl to control discard of vstate during syscall
From: Drew Fustini <dfustini@...storrent.com>
Clobbering the vector registers can significantly increase system call
latency for some implementations. To mitigate this performance impact, a
policy mechanism is provided to administrators, distro maintainers, and
developers to control vector state discard in the form of a sysctl knob:
/proc/sys/abi/riscv_v_vstate_discard
Valid values are:
0: Do not discard vector state during syscall
1: Discard vector state during syscall
The initial state is controlled by CONFIG_RISCV_ISA_V_VSTATE_DISCARD.
Fixes: 9657e9b7d253 ("riscv: Discard vector state on syscalls")
Signed-off-by: Drew Fustini <dfustini@...storrent.com>
---
Documentation/arch/riscv/vector.rst | 15 +++++++++++++++
arch/riscv/Kconfig | 10 ++++++++++
arch/riscv/include/asm/vector.h | 4 ++++
arch/riscv/kernel/vector.c | 16 +++++++++++++++-
4 files changed, 44 insertions(+), 1 deletion(-)
I've tested the impact of riscv_v_vstate_discard() on the SiFive X280
cores [1] in the Tenstorrent Blackhole SoC [2]. The results from the
Blackhole P100 [3] card show that discarding the vector registers
increases null syscall latency by 25%.
The null syscall program [4] executes vsetvli and then calls getppid()
in a loop. The average duration of getppid() is 198 ns when registers
are clobbered in riscv_v_vstate_discard(). The average duration drops
to 149 ns when riscv_v_vstate_discard() skips clobbering the registers
as result of riscv_v_vstate_discard being set to 0.
$ sudo sysctl abi.riscv_v_vstate_discard=1
abi.riscv_v_vstate_discard = 1
$ ./null_syscall --vsetvli
vsetvli complete
iterations: 1000000000
duration: 198 seconds
avg latency: 198.73 ns
$ sudo sysctl abi.riscv_v_vstate_discard=0
abi.riscv_v_vstate_discard = 0
$ ./null_syscall --vsetvli
vsetvli complete
iterations: 1000000000
duration: 149 seconds
avg latency: 149.89 ns
I'm testing on the tt-blackhole-v6.16-rc1_vstate_discard [5] branch that
has 13 patches, including this one, on top of v6.16-rc1. Most are simple
yaml patches for dt bindings along with dts files and a bespoke network
driver. I don't think the other patches are relevant to this discussion.
This patch applies clean on its own to riscv/for-next and next-20250718.
[1] https://www.sifive.com/cores/intelligence-x200-series
[2] https://tenstorrent.com/en/hardware/blackhole
[3] https://github.com/tenstorrent/tt-bh-linux
[4] https://gist.github.com/tt-fustini/ab9b217756912ce75522b3cce11d0d58
[5] https://github.com/tenstorrent/linux/tree/tt-blackhole-v6.16-rc1_vstate_discard
diff --git a/Documentation/arch/riscv/vector.rst b/Documentation/arch/riscv/vector.rst
index 3987f5f76a9d..1edbce436015 100644
--- a/Documentation/arch/riscv/vector.rst
+++ b/Documentation/arch/riscv/vector.rst
@@ -137,4 +137,19 @@ processes in form of sysctl knob:
As indicated by version 1.0 of the V extension [1], vector registers are
clobbered by system calls.
+Clobbering the vector registers can significantly increase system call latency
+for some implementations. To mitigate the performance impact, a policy mechanism
+is provided to the administrators, distro maintainers, and developers to control
+the vstate discard in the form of a sysctl knob:
+
+* /proc/sys/abi/riscv_v_vstate_discard
+
+ Valid values are:
+
+ * 0: Do not discard vector state during syscall
+ * 1: Discard vector state during syscall
+
+ Reading this file returns the current discard behavior. The initial state is
+ controlled by CONFIG_RISCV_ISA_V_VSTATE_DISCARD.
+
1: https://github.com/riscv/riscv-v-spec/blob/master/calling-convention.adoc
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0aeee50da016..c0039f21d1f0 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -655,6 +655,16 @@ config RISCV_ISA_V_DEFAULT_ENABLE
If you don't know what to do here, say Y.
+config RISCV_ISA_V_VSTATE_DISCARD
+ bool "Enable Vector state discard by default"
+ depends on RISCV_ISA_V
+ default n
+ help
+ Say Y here if you want to enable Vector state discard on syscall.
+ Otherwise, userspace has to enable it via the sysctl interface.
+
+ If you don't know what to do here, say N.
+
config RISCV_ISA_V_UCOPY_THRESHOLD
int "Threshold size for vectorized user copies"
depends on RISCV_ISA_V
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 45c9b426fcc5..77991013216b 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -40,6 +40,7 @@
_res; \
})
+extern bool riscv_v_vstate_discard_ctl;
extern unsigned long riscv_v_vsize;
int riscv_v_setup_vsize(void);
bool insn_is_vector(u32 insn_buf);
@@ -270,6 +271,9 @@ static inline void __riscv_v_vstate_discard(void)
{
unsigned long vl, vtype_inval = 1UL << (BITS_PER_LONG - 1);
+ if (READ_ONCE(riscv_v_vstate_discard_ctl) == 0)
+ return;
+
riscv_v_enable();
if (has_xtheadvector())
asm volatile (THEAD_VSETVLI_T4X0E8M8D1 : : : "t4");
diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 184f780c932d..7a4c209ad337 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -26,6 +26,7 @@ static struct kmem_cache *riscv_v_user_cachep;
static struct kmem_cache *riscv_v_kernel_cachep;
#endif
+bool riscv_v_vstate_discard_ctl = IS_ENABLED(CONFIG_RISCV_ISA_V_VSTATE_DISCARD);
unsigned long riscv_v_vsize __read_mostly;
EXPORT_SYMBOL_GPL(riscv_v_vsize);
@@ -307,11 +308,24 @@ static const struct ctl_table riscv_v_default_vstate_table[] = {
},
};
+static const struct ctl_table riscv_v_vstate_discard_table[] = {
+ {
+ .procname = "riscv_v_vstate_discard",
+ .data = &riscv_v_vstate_discard_ctl,
+ .maxlen = sizeof(riscv_v_vstate_discard_ctl),
+ .mode = 0644,
+ .proc_handler = proc_dobool,
+ },
+};
+
static int __init riscv_v_sysctl_init(void)
{
- if (has_vector() || has_xtheadvector())
+ if (has_vector() || has_xtheadvector()) {
if (!register_sysctl("abi", riscv_v_default_vstate_table))
return -EINVAL;
+ if (!register_sysctl("abi", riscv_v_vstate_discard_table))
+ return -EINVAL;
+ }
return 0;
}
--
2.34.1
Powered by blists - more mailing lists