lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAAhSdy0TfpWQ-kC_gUUCU0oC5dR45A1v9q84H2Tj9A8kdO0d1A@mail.gmail.com>
Date: Wed, 30 Apr 2025 15:47:13 +0530
From: Anup Patel <anup@...infault.org>
To: Radim Krčmář <rkrcmar@...tanamicro.com>
Cc: Anup Patel <apatel@...tanamicro.com>, kvm-riscv@...ts.infradead.org, 
	kvm@...r.kernel.org, linux-riscv@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, Atish Patra <atishp@...shpatra.org>, 
	Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt <palmer@...belt.com>, 
	Albert Ou <aou@...s.berkeley.edu>, Alexandre Ghiti <alex@...ti.fr>, 
	Andrew Jones <ajones@...tanamicro.com>, Mayuresh Chitale <mchitale@...tanamicro.com>
Subject: Re: [PATCH 4/5] KVM: RISC-V: reset VCPU state when becoming runnable

On Wed, Apr 30, 2025 at 1:59 PM Radim Krčmář <rkrcmar@...tanamicro.com> wrote:
>
> 2025-04-30T10:56:35+05:30, Anup Patel <anup@...infault.org>:
> > On Wed, Apr 30, 2025 at 9:52 AM Anup Patel <anup@...infault.org> wrote:
> >>
> >> On Tue, Apr 29, 2025 at 9:51 PM Radim Krčmář <rkrcmar@...tanamicro.com> wrote:
> >> >
> >> > 2025-04-29T20:31:18+05:30, Anup Patel <anup@...infault.org>:
> >> > > On Tue, Apr 29, 2025 at 3:55 PM Radim Krčmář <rkrcmar@...tanamicro.com> wrote:
> >> > >>
> >> > >> 2025-04-29T11:25:35+05:30, Anup Patel <apatel@...tanamicro.com>:
> >> > >> > On Mon, Apr 28, 2025 at 11:15 PM Radim Krčmář <rkrcmar@...tanamicro.com> wrote:
> >> > >> >>
> >> > >> >> 2025-04-28T17:52:25+05:30, Anup Patel <anup@...infault.org>:
> >> > >> >> > On Thu, Apr 3, 2025 at 5:02 PM Radim Krčmář <rkrcmar@...tanamicro.com> wrote:
> >> > >> >> >> For a cleaner solution, we should add interfaces to perform the KVM-SBI
> >> > >> >> >> reset request on userspace demand.  I think it would also be much better
> >> > >> >> >> if userspace was in control of the post-reset state.
> >> > >> >> >
> >> > >> >> > Apart from breaking KVM user-space, this patch is incorrect and
> >> > >> >> > does not align with the:
> >> > >> >> > 1) SBI spec
> >> > >> >> > 2) OS boot protocol.
> >> > >> >> >
> >> > >> >> > The SBI spec only defines the entry state of certain CPU registers
> >> > >> >> > (namely, PC, A0, and A1) when CPU enters S-mode:
> >> > >> >> > 1) Upon SBI HSM start call from some other CPU
> >> > >> >> > 2) Upon resuming from non-retentive SBI HSM suspend or
> >> > >> >> >     SBI system suspend
> >> > >> >> >
> >> > >> >> > The S-mode entry state of the boot CPU is defined by the
> >> > >> >> > OS boot protocol and not by the SBI spec. Due to this, reason
> >> > >> >> > KVM RISC-V expects user-space to set up the S-mode entry
> >> > >> >> > state of the boot CPU upon system reset.
> >> > >> >>
> >> > >> >> We can handle the initial state consistency in other patches.
> >> > >> >> What needs addressing is a way to trigger the KVM reset from userspace,
> >> > >> >> even if only to clear the internal KVM state.
> >> > >> >>
> >> > >> >> I think mp_state is currently the best signalization that KVM should
> >> > >> >> reset, so I added it there.
> >> > >> >>
> >> > >> >> What would be your preferred interface for that?
> >> > >> >>
> >> > >> >
> >> > >> > Instead of creating a new interface, I would prefer that VCPU
> >> > >> > which initiates SBI System Reset should be resetted immediately
> >> > >> > in-kernel space before forwarding the system reset request to
> >> > >> > user space.
> >> > >>
> >> > >> The initiating VCPU might not be the boot VCPU.
> >> > >> It would be safer to reset all of them.
> >> > >
> >> > > I meant initiating VCPU and not the boot VCPU. Currently, the
> >> > > non-initiating VCPUs are already resetted by VCPU requests
> >> > > so nothing special needs to be done.
> >>
> >> There is no designated boot VCPU for KVM so let us only use the
> >> term "initiating" or "non-initiating" VCPUs in context of system reset.
>
> That is exactly how I use it.  Some VCPU will be the boot VCPU (the VCPU
> made runnable by KVM_SET_MP_STATE) and loaded with state from userspace.
>
> RISC-V doesn't guarantee that the boot VCPU is the reset initiating
> VCPU, so I think KVM should allow it.

We do allow any VCPU to be the boot VCPU. I am not sure why you
think otherwise.

>
> >> > Currently, we make the request only for VCPUs brought up by HSM -- the
> >> > non-boot VCPUs.  There is a single VCPU not being reset and resetting
> >> > the reset initiating VCPU changes nothing. e.g.
> >> >
> >> >   1) VCPU 1 initiates the reset through an ecall.
> >> >   2) All VCPUs are stopped and return to userspace.
> >>
> >> When all VCPUs are stopped, all VCPUs except VCPU1
> >> (in this example) will SLEEP because we do
> >> "kvm_make_all_cpus_request(vcpu->kvm, KVM_REQ_SLEEP)"
> >> so none of the VCPUs except VCPU1 (in this case) will
> >> return to userspace.
>
> Userspace should be able to do whatever it likes -- in my example, all
> the VCPUs are brought to userspace and a different boot VCPU is
> selected.

In your example, the VCPU1 (initiating VCPU) need not be the
boot VCPU after system reset. The user space can setup some
other VCPU as boot VCPU (by setting its MPSTATE, PC, A0,
and A1) and simply do ioctl_run() for the initiating VCPU without
changing its MPSTATE.

>
> (Perhaps userspace wanted to record their reset pre-reset state, or
>  maybe it really wants to boot with a designated VCPU.)
>
> >> >   3) Userspace prepares VCPU 0 as the boot VCPU.
> >> >   4) VCPU 0 executes without going through KVM reset paths.
> >>
> >> Userspace will see a system reset event exit for the
> >> initiating VCPU by that time all other VCPUs are already
> >> sleeping with mp_state == KVM_MP_STATE_STOPPED.
> >>
> >> >
> >> > The point of this patch is to reset the boot VCPU, so we reset the VCPU
> >> > that is made runnable by the KVM_SET_MP_STATE IOCTL.
> >>
> >> Like I said before, we don't need to do this. The initiating VCPU
> >> can be resetted just before exiting to user space for system reset
> >> event exit.
>
> You assume initiating VCPU == boot VCPU.
>
> We should prevent KVM_SET_MP_STATE IOCTL for all non-initiating VCPUs if
> we decide to accept the assumption.

There is no such assumption.

>
> I'd rather choose a different design, though.
>
> How about a new userspace interface for IOCTL reset?
> (Can be capability toggle for KVM_SET_MP_STATE or a straight new IOCTL.)
>
> That wouldn't "fix" current userspaces, but would significantly improve
> the sanity of the KVM interface.

I believe the current implementation needs a few improvements
that's all. We certainly don't need to introduce any new IOCTL.

Also, keep in mind that so far we have avoided any RISC-V
specific KVM IOCTLs and we should try to keep it that way
as long as we can.

Regards,
Anup

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ