Message-ID: <aCyythZJ2u0SbXVb@debug.ba.rivosinc.com>
Date: Tue, 20 May 2025 09:49:58 -0700
From: Deepak Gupta <debug@...osinc.com>
To: Cyril Bur <cyrilbur@...storrent.com>
Cc: palmer@...belt.com, aou@...s.berkeley.edu, paul.walmsley@...ive.com,
charlie@...osinc.com, jrtc27@...c27.com, ben.dooks@...ethink.co.uk,
alex@...ti.fr, linux-riscv@...ts.infradead.org,
linux-kernel@...r.kernel.org, jszhang@...nel.org,
syzbot+e74b94fe601ab9552d69@...kaller.appspotmail.com
Subject: Re: [PATCH v6 1/5] riscv: save the SR_SUM status over switches
I did give this patch my RB and had planned to come back to it to see
if it impacts the CFI-related patches. Thanks to Alex for bringing it
to my attention again. As it stands today, it doesn't impact the CFI
changes, but I have some concerns.
Overall I do agree we should reduce the number of SSTATUS accesses.
A couple of questions on introducing the new `status` field (inline).
On Tue, Apr 22, 2025 at 04:01:35PM -0700, Deepak Gupta wrote:
>On Thu, Apr 10, 2025 at 07:05:22AM +0000, Cyril Bur wrote:
>>From: Ben Dooks <ben.dooks@...ethink.co.uk>
>>
>>When threads/tasks are switched we need to ensure the old execution's
>>SR_SUM state is saved and the new thread has the old SR_SUM state
>>restored.
>>
>>The issue was seen under heavy load especially with the syz-stress tool
>>running, with crashes as follows in schedule_tail:
>>
>>Unable to handle kernel access to user memory without uaccess routines
>>at virtual address 000000002749f0d0
>>Oops [#1]
>>Modules linked in:
>>CPU: 1 PID: 4875 Comm: syz-executor.0 Not tainted
>>5.12.0-rc2-syzkaller-00467-g0d7588ab9ef9 #0
>>Hardware name: riscv-virtio,qemu (DT)
>>epc : schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
>>ra : task_pid_vnr include/linux/sched.h:1421 [inline]
>>ra : schedule_tail+0x70/0xb2 kernel/sched/core.c:4264
>>epc : ffffffe00008c8b0 ra : ffffffe00008c8ae sp : ffffffe025d17ec0
>>gp : ffffffe005d25378 tp : ffffffe00f0d0000 t0 : 0000000000000000
>>t1 : 0000000000000001 t2 : 00000000000f4240 s0 : ffffffe025d17ee0
>>s1 : 000000002749f0d0 a0 : 000000000000002a a1 : 0000000000000003
>>a2 : 1ffffffc0cfac500 a3 : ffffffe0000c80cc a4 : 5ae9db91c19bbe00
>>a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe000082eba
>>s2 : 0000000000040000 s3 : ffffffe00eef96c0 s4 : ffffffe022c77fe0
>>s5 : 0000000000004000 s6 : ffffffe067d74e00 s7 : ffffffe067d74850
>>s8 : ffffffe067d73e18 s9 : ffffffe067d74e00 s10: ffffffe00eef96e8
>>s11: 000000ae6cdf8368 t3 : 5ae9db91c19bbe00 t4 : ffffffc4043cafb2
>>t5 : ffffffc4043cafba t6 : 0000000000040000
>>status: 0000000000000120 badaddr: 000000002749f0d0 cause:
>>000000000000000f
>>Call Trace:
>>[<ffffffe00008c8b0>] schedule_tail+0x72/0xb2 kernel/sched/core.c:4264
>>[<ffffffe000005570>] ret_from_exception+0x0/0x14
>>Dumping ftrace buffer:
>> (ftrace buffer empty)
>>---[ end trace b5f8f9231dc87dda ]---
>>
>>The issue comes from the put_user() in schedule_tail
>>(kernel/sched/core.c) doing the following:
>>
>>asmlinkage __visible void schedule_tail(struct task_struct *prev)
>>{
>>...
>> if (current->set_child_tid)
>> put_user(task_pid_vnr(current), current->set_child_tid);
>>...
>>}
>>
>>the put_user() macro causes the code sequence to come out as follows:
>>
>>1: __enable_user_access()
>>2: reg = task_pid_vnr(current);
>>3: *current->set_child_tid = reg;
>>4: __disable_user_access()
>>
>>The problem is that we may have a sleeping function as argument which
>>could clear SR_SUM causing the panic above. This was fixed by
>>evaluating the argument of the put_user() macro outside the user-enabled
>>section in commit 285a76bb2cf5 ("riscv: evaluate put_user() arg before
>>enabling user access")
>>
>>In order for riscv to take advantage of unsafe_get/put_XXX() macros and
>>to avoid the same issue we had with put_user() and sleeping functions we
>>must ensure code flow can go through switch_to() from within a region of
>>code with SR_SUM enabled and come back with SR_SUM still enabled. This
>>patch addresses the problem allowing future work to enable full use of
>>unsafe_get/put_XXX() macros without needing to take a CSR bit flip cost
>>on every access. Make switch_to() save and restore SR_SUM.
>>
>>Reported-by: syzbot+e74b94fe601ab9552d69@...kaller.appspotmail.com
>>Signed-off-by: Ben Dooks <ben.dooks@...ethink.co.uk>
>>Signed-off-by: Cyril Bur <cyrilbur@...storrent.com>
>>---
>>arch/riscv/include/asm/processor.h | 1 +
>>arch/riscv/kernel/asm-offsets.c | 5 +++++
>>arch/riscv/kernel/entry.S | 8 ++++++++
>>3 files changed, 14 insertions(+)
>>
>>diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
>>index 5f56eb9d114a..58fd11c89fe9 100644
>>--- a/arch/riscv/include/asm/processor.h
>>+++ b/arch/riscv/include/asm/processor.h
>>@@ -103,6 +103,7 @@ struct thread_struct {
>> struct __riscv_d_ext_state fstate;
>> unsigned long bad_cause;
>> unsigned long envcfg;
>>+ unsigned long status;
Do we really need a new member field in `thread_struct`? We already have
`sstatus` in `pt_regs`, which reflects the overall execution environment
for the current thread and gets saved and restored on trap entry and exit.
Putting `status` in `thread_struct` creates ambiguity, from a future
maintainability standpoint, about which `status` to save to and restore
from as new fields get introduced to this CSR.
Why can't we access the current trap frame's `sstatus` image in
`__switch_to` to save and restore?
Let me know if I am missing something obvious here. If there is a
complication I am missing and we do end up using this member field, I
would rename it to something like `status_kernel` to reflect that, so
that future changes are cognizant of the fact that we have split
`status`: one for the kernel execution environment per thread and one
for controlling the user execution environment per thread.
>> u32 riscv_v_flags;
>> u32 vstate_ctrl;
>> struct __riscv_v_ext_state vstate;
>>diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
>>index 16490755304e..969c65b1fe41 100644
>>--- a/arch/riscv/kernel/asm-offsets.c
>>+++ b/arch/riscv/kernel/asm-offsets.c
>>@@ -34,6 +34,7 @@ void asm_offsets(void)
>> OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
>> OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
>> OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);