Message-ID: <9290f53d-3545-4299-9781-c1c558f71158@rivosinc.com>
Date: Fri, 21 Nov 2025 15:22:40 +0100
From: Clément Léger <cleger@...osinc.com>
To: Zhanpeng Zhang <zhangzhanpeng.jasper@...edance.com>
Cc: Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Himanshu Chauhan <hchauhan@...tanamicro.com>,
Anup Patel <apatel@...tanamicro.com>, 路旭
<luxu.kernel@...edance.com>, Atish Patra <atishp@...shpatra.org>,
Björn Töpel <bjorn@...osinc.com>,
崔运辉 <cuiyunhui@...edance.com>,
元竹 <yuanzhu@...edance.com>
Subject: Re: SSE May Corrupt KVM's Context
On 11/21/25 09:26, Zhanpeng Zhang wrote:
> Hi Clément,
>
> I encountered another SSE problem recently: host SSE events may affect the guest. Neither PMU-SBI-SSE nor KVM-SSE is enabled. Simply using riscv_sse_test to trigger an ecall SSE affects the KVM vcpu running in the background.
Hi Zhanpeng,
I'd be inclined to say this is most probably a problem in the SBI
implementation, which incorrectly sends events to, or restores state from,
the virt context.
I'll try to reproduce it with OpenSBI in the upcoming days.
Thanks,
Clément
>
> The following log is the output of KVM and QEMU when vcpu crashes:
>
> [ 152.228548] kvm [145]: VCPU exit error -14
> [ 152.230664] kvm [145]: SEPC=0xffffffff80040280 SSTATUS=0x200004500 HSTATUS=0x200201100
> [ 152.231868] kvm [145]: SCAUSE=0xf STVAL=0x3c8 HTVAL=0x0 HTINST=0x103023
> error: kvm run failed Bad address
> pc ffffffff80040280
> mhartid 0000000000000000
> mstatus 0000000200000100
> mip 0000000000000000
> mie 0000000000000000
> mideleg 0000000000000000
> medeleg 0000000000000000
> mtvec 0000000000000000
> mepc 0000000000000000
> mcause 0000000000000000
> mtval 0000000000000000
> mscratch 0000000000000000
> x0/zero 0000000000000000 x1/ra ffffffff80c1ac2c x2/sp ffffffff81e03cf0 x3/gp ffffffff8201aa68
> x4/tp ffffffff81e0e0c0 x5/t0 ffffffff80b91a08 x6/t1 ffffaf80fee00000 x7/t2 0000000200000020
> x8/s0 ffffffff81e03d50 x9/s1 0000000001000000 x10/a0 0000000000000000 x11/a1 0000000000000000
> x12/a2 0000000001000000 x13/a3 ffffaf80ffe00000 x14/a4 0000000000000000 x15/a5 ffffaf7f80000000
> x16/a6 0000000000000002 x17/a7 0000000000000002 x18/s2 ffffaf80fee00000 x19/s3 ffffffff81639170
> x20/s4 ffffffff820200f8 x21/s5 0000000000000000 x22/s6 0000000000000000 x23/s7 0000000000000000
> x24/s8 0000000000000000 x25/s9 0000000000000000 x26/s10 0000000000000000 x27/s11 0000000000000000
> x28/t3 ffffffff80b85110 x29/t4 ffffffff8205d258 x30/t5 ffffffff8205d258 x31/t6 ffffffff8205d2a0
> [ 152.312824] riscv_sse_test: FAILED: Failed to wait for event local_software_injected completion on CPU 2
> [ 152.317933] riscv_sse_test: FAILED: Received SSE event -65536 on CPU 2 instead of 4
>
> The following is the assembly context where the vcpu crashes. It is the SAVE_GUEST_GPRS macro, which runs on the return to the host (vcpu_switch.S: Lkvm_switch_return).
> sscratch is mistakenly set to 0 (or to some other bad, low address), causing a crash when the guest context is stored.
>
> 0xffffffff8004027c <+252>: csrrw a0,sscratch,a0
> 0xffffffff80040280 <+256>: sd ra,968(a0)
> 0xffffffff80040284 <+260>: sd sp,976(a0)
> 0xffffffff80040288 <+264>: sd gp,984(a0)
> 0xffffffff8004028c <+268>: sd tp,992(a0)
> 0xffffffff80040290 <+272>: sd t0,1000(a0)
> 0xffffffff80040294 <+276>: sd t1,1008(a0)
>
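The quoted log and disassembly can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the original report: the helper names (`decode_store`, `implied_base`) are made up for illustration, the constants are copied from the log above, and it assumes STVAL holds the effective address of the store at SEPC (`sd ra,968(a0)`).

```python
# Values copied from the KVM exit log quoted above.
SCAUSE = 0xF
STVAL = 0x3C8
HTINST = 0x103023

# Offset of the faulting instruction at SEPC: sd ra,968(a0)
STORE_OFFSET = 968

# RISC-V exception codes for page faults (privileged specification).
PAGE_FAULT_CAUSES = {
    12: "instruction page fault",
    13: "load page fault",
    15: "store/AMO page fault",
}


def decode_store(insn):
    """Split an S-type (store) instruction word into its fields.

    Hypothetical helper for illustration; the immediate is returned
    unsigned, which is enough for the values in this log.
    """
    return {
        "opcode": insn & 0x7F,
        "funct3": (insn >> 12) & 0x7,
        "rs1": (insn >> 15) & 0x1F,
        "rs2": (insn >> 20) & 0x1F,
        "imm": (((insn >> 25) & 0x7F) << 5) | ((insn >> 7) & 0x1F),
    }


def implied_base(stval, offset):
    """Base register value implied by a fault address and a store offset."""
    return stval - offset


if __name__ == "__main__":
    print("cause:", PAGE_FAULT_CAUSES.get(SCAUSE, "other"))
    print("implied a0:", hex(implied_base(STVAL, STORE_OFFSET)))
    print("htinst fields:", decode_store(HTINST))
```

Under those assumptions, the implied base is 0, which matches the observation that sscratch held 0 when `csrrw a0,sscratch,a0` ran. The HTINST fields decode as a 64-bit store (opcode 0x23, funct3 3) of x1/ra, which is consistent with the `sd ra,968(a0)` at SEPC.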
> This problem reproduces every time in my environment. As long as an idle VM is running in the background, running the SSE test in a loop on the host makes the vcpu crash. Can this problem be reproduced in your environment?
>
> Regards,
> Zhanpeng