linux-kernel - Re: SSE May Corrupt KVM's Context

Message-ID: <9290f53d-3545-4299-9781-c1c558f71158@rivosinc.com>
Date: Fri, 21 Nov 2025 15:22:40 +0100
From: Clément Léger <cleger@...osinc.com>
To: Zhanpeng Zhang <zhangzhanpeng.jasper@...edance.com>
Cc: Paul Walmsley <paul.walmsley@...ive.com>,
 Palmer Dabbelt <palmer@...belt.com>,
 "linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linux-arm-kernel@...ts.infradead.org"
 <linux-arm-kernel@...ts.infradead.org>,
 Himanshu Chauhan <hchauhan@...tanamicro.com>,
 Anup Patel <apatel@...tanamicro.com>, 路旭
 <luxu.kernel@...edance.com>, Atish Patra <atishp@...shpatra.org>,
 Björn Töpel <bjorn@...osinc.com>,
 崔运辉 <cuiyunhui@...edance.com>,
 元竹 <yuanzhu@...edance.com>
Subject: Re: SSE May Corrupt KVM's Context



On 11/21/25 09:26, Zhanpeng Zhang wrote:
> Hi Clément, 
> 
> I encountered another SSE problem recently: host SSE events may affect the guest. Neither PMU-SBI-SSE nor KVM-SSE is enabled. Simply using the simplest riscv_sse_test to trigger an ecall SSE affects the KVM vcpu running in the background.

Hi Zhanpeng,

I'd be inclined to say this is most probably a problem in the SBI
implementation, which incorrectly sends the event to, or restores
from, the virt context. I'll try to reproduce it with OpenSBI in the
upcoming days.

Thanks,

Clément

> 
> The following log is the output of KVM and QEMU when the vcpu crashes:
> 
> [  152.228548] kvm [145]: VCPU exit error -14
> [  152.230664] kvm [145]: SEPC=0xffffffff80040280 SSTATUS=0x200004500 HSTATUS=0x200201100
> [  152.231868] kvm [145]: SCAUSE=0xf STVAL=0x3c8 HTVAL=0x0 HTINST=0x103023
> error: kvm run failed Bad address
>  pc       ffffffff80040280
>  mhartid  0000000000000000
>  mstatus  0000000200000100
>  mip      0000000000000000
>  mie      0000000000000000
>  mideleg  0000000000000000
>  medeleg  0000000000000000
>  mtvec    0000000000000000
>  mepc     0000000000000000
>  mcause   0000000000000000
>  mtval    0000000000000000
>  mscratch 0000000000000000
>  x0/zero  0000000000000000 x1/ra    ffffffff80c1ac2c x2/sp    ffffffff81e03cf0 x3/gp    ffffffff8201aa68
>  x4/tp    ffffffff81e0e0c0 x5/t0    ffffffff80b91a08 x6/t1    ffffaf80fee00000 x7/t2    0000000200000020
>  x8/s0    ffffffff81e03d50 x9/s1    0000000001000000 x10/a0   0000000000000000 x11/a1   0000000000000000
>  x12/a2   0000000001000000 x13/a3   ffffaf80ffe00000 x14/a4   0000000000000000 x15/a5   ffffaf7f80000000
>  x16/a6   0000000000000002 x17/a7   0000000000000002 x18/s2   ffffaf80fee00000 x19/s3   ffffffff81639170
>  x20/s4   ffffffff820200f8 x21/s5   0000000000000000 x22/s6   0000000000000000 x23/s7   0000000000000000
>  x24/s8   0000000000000000 x25/s9   0000000000000000 x26/s10  0000000000000000 x27/s11  0000000000000000
>  x28/t3   ffffffff80b85110 x29/t4   ffffffff8205d258 x30/t5   ffffffff8205d258 x31/t6   ffffffff8205d2a0
> [  152.312824] riscv_sse_test: FAILED: Failed to wait for event local_software_injected completion on CPU 2
> [  152.317933] riscv_sse_test: FAILED: Received SSE event -65536 on CPU 2 instead of 4
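
As a reading aid for the log above, the following Python sketch decodes the SCAUSE and HTINST values and reinterprets the event number printed by riscv_sse_test as an unsigned 32-bit value. The exception-code table is a small subset taken from the RISC-V privileged specification. The helper name decode_store is hypothetical. The note that 0xffff0000 is the SBI SSE ID of the local software-injected event is an assumption based on the event name in the log, not something the log itself states.

```python
# Values copied from the KVM exit log above.
SCAUSE = 0xF
HTINST = 0x103023
REPORTED_EVENT = -65536  # as printed by riscv_sse_test

# Subset of the exception codes in the RISC-V privileged specification.
EXCEPTION_NAMES = {
    0xC: "instruction page fault",
    0xD: "load page fault",
    0xF: "store/AMO page fault",
}


def decode_store(insn: int) -> dict:
    """Split a 32-bit S-type (store) encoding into its fields (hypothetical helper)."""
    imm = (((insn >> 25) & 0x7F) << 5) | ((insn >> 7) & 0x1F)
    if imm & 0x800:  # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": insn & 0x7F,         # 0x23 is the STORE major opcode
        "funct3": (insn >> 12) & 0x7,  # 3 selects a 64-bit store (sd)
        "rs1": (insn >> 15) & 0x1F,
        "rs2": (insn >> 20) & 0x1F,    # source register number, 1 is ra
        "imm": imm,
    }


fields = decode_store(HTINST)
# Reinterpret the signed value as unsigned 32-bit. 0xffff0000 is assumed
# here to be the SSE ID of the local software-injected event.
event_id_u32 = REPORTED_EVENT & 0xFFFFFFFF

print(EXCEPTION_NAMES[SCAUSE])
print(fields)
print(hex(event_id_u32))
```

Under these assumptions the fault is a store/AMO page fault, and HTINST decodes as an sd of ra with the rs1 and immediate fields zeroed, which is consistent with the transformed-instruction format the hypervisor extension uses for htinst (the rs1 field there carries an address offset rather than a register number).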
> 
> The following is the assembly context where the vcpu crashes: the SAVE_GUEST_GPRS macro, which runs when execution gets back to the host (vcpu_switch.S: Lkvm_switch_return).
> sscratch is set to 0 (or another bad, low address) by mistake, causing a crash when storing the virtual machine context.
> 
>    0xffffffff8004027c <+252>:   csrrw   a0,sscratch,a0
>    0xffffffff80040280 <+256>:   sd      ra,968(a0)
>    0xffffffff80040284 <+260>:   sd      sp,976(a0)
>    0xffffffff80040288 <+264>:   sd      gp,984(a0)
>    0xffffffff8004028c <+268>:   sd      tp,992(a0)
>    0xffffffff80040290 <+272>:   sd      t0,1000(a0)
>    0xffffffff80040294 <+276>:   sd      t1,1008(a0)
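
The claim that sscratch held 0 can be cross-checked against the log with a little arithmetic. The sketch below assumes only the SEPC and STVAL values from the KVM exit log and the addresses and offsets from the disassembly above. The STORES table is a hand-copied, hypothetical helper for illustration.

```python
# Values copied from the KVM exit log.
SEPC = 0xFFFFFFFF80040280
STVAL = 0x3C8

# address -> (source register, offset) for the `sd reg,offset(a0)` lines
# in the disassembly above (hand-copied for illustration).
STORES = {
    0xFFFFFFFF80040280: ("ra", 968),
    0xFFFFFFFF80040284: ("sp", 976),
    0xFFFFFFFF80040288: ("gp", 984),
    0xFFFFFFFF8004028C: ("tp", 992),
    0xFFFFFFFF80040290: ("t0", 1000),
    0xFFFFFFFF80040294: ("t1", 1008),
}

# SEPC identifies the faulting instruction, STVAL the faulting address.
reg, offset = STORES[SEPC]
# For `sd reg,offset(a0)` the effective address is a0 + offset, so the
# base register value at the time of the fault is STVAL - offset.
base = STVAL - offset

print(f"faulting store: sd {reg},{offset}(a0), a0 = {base:#x}")
# prints "faulting store: sd ra,968(a0), a0 = 0x0"
```

Since 0x3c8 equals 968, the first store after the csrrw faulted with a0 equal to 0, which matches the description of sscratch having been zeroed before the swap.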
> 
> This problem reproduces every time in my environment: as long as an idle VM is running in the background, running the SSE test in a loop on the host makes the vcpu crash. Can this problem be reproduced in your environment?
> 
> Regards,
> Zhanpeng