linux-kernel - Re: [PATCH] RISC-V: KVM: Redirect instruction access fault trap to guest

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOnJCUJr8_EWrtU_9i1U+6KRNRhqyt__sS5J5Q+vueOfchNM2w@mail.gmail.com>
Date: Wed, 11 Dec 2024 00:00:28 -0800
From: Atish Patra <atishp@...shpatra.org>
To: Samuel Holland <samuel.holland@...ive.com>
Cc: Anup Patel <anup@...infault.org>, Quan Zhou <zhouquan@...as.ac.cn>, ajones@...tanamicro.com, 
	paul.walmsley@...ive.com, palmer@...belt.com, aou@...s.berkeley.edu, 
	linux-kernel@...r.kernel.org, linux-riscv@...ts.infradead.org, 
	kvm@...r.kernel.org, kvm-riscv@...ts.infradead.org
Subject: Re: [PATCH] RISC-V: KVM: Redirect instruction access fault trap to guest

On Sat, Sep 14, 2024 at 9:39 AM Samuel Holland
<samuel.holland@...ive.com> wrote:
>
> Hi Anup,
>
> On 2024-09-12 11:32 PM, Anup Patel wrote:
> > On Fri, Sep 13, 2024 at 6:09 AM Samuel Holland
> > <samuel.holland@...ive.com> wrote:
> >>
> >> On 2024-09-12 4:03 AM, Quan Zhou wrote:
> >>>
> >>> On 2024/8/29 14:20, zhouquan@...as.ac.cn wrote:
> >>>> From: Quan Zhou <zhouquan@...as.ac.cn>
> >>>>
> >>>> The M-mode redirects an unhandled instruction access
> >>>> fault trap back to S-mode when not delegating it to
> >>>> VS-mode(hedeleg). However, KVM running in HS-mode
> >>>> terminates the VS-mode software when back from M-mode.
> >>>>
> >>>> The KVM should redirect the trap back to VS-mode, and
> >>>> let VS-mode trap handler decide the next step.
> >>>>
> >>>> Signed-off-by: Quan Zhou <zhouquan@...as.ac.cn>
> >>>> ---
> >>>>   arch/riscv/kvm/vcpu_exit.c | 1 +
> >>>>   1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c
> >>>> index fa98e5c024b2..696b62850d0b 100644
> >>>> --- a/arch/riscv/kvm/vcpu_exit.c
> >>>> +++ b/arch/riscv/kvm/vcpu_exit.c
> >>>> @@ -182,6 +182,7 @@ int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, struct
> >>>> kvm_run *run,
> >>>>       ret = -EFAULT;
> >>>>       run->exit_reason = KVM_EXIT_UNKNOWN;
> >>>>       switch (trap->scause) {
> >>>> +    case EXC_INST_ACCESS:
> >>>
> >>> A gentle ping, the instruction access fault should be redirected to
> >>> VS-mode for handling, is my understanding correct?
> >>
> >> Yes, this looks correct. However, I believe it would be equivalent (and more
> >> efficient) to add EXC_INST_ACCESS to KVM_HEDELEG_DEFAULT in asm/kvm_host.h.
> >>
> >> I don't understand why some exceptions are delegated with hedeleg and others are
> >> caught and redirected here with no further processing. Maybe someone thought
> >> that it wasn't valid to set a bit in hedeleg if the corresponding bit was
> >> cleared in medeleg? But this doesn't make sense, as S-mode cannot know which
> >> bits are set in medeleg (maybe none are!).
> >>
> >> So the hypervisor must either:
> >>  1) assume M-mode firmware checks hedeleg and redirects exceptions to VS-mode
> >>     regardless of medeleg, in which case all four of these exceptions can be
> >>     moved to KVM_HEDELEG_DEFAULT and removed from this switch statement, or
> >>
> >>  2) assume M-mode might not check hedeleg and redirect exceptions to VS-mode,
> >>     and since no bits are guaranteed to be set in medeleg, any bit set in
> >>     hedeleg must _also_ be handled in the switch case here.
> >>
> >> Anup, Atish, thoughts?
> >
> > Any exception delegated to VS-mode via hedeleg means it is directly delivered
> > to VS-mode without any intervention of HS-mode. This aligns with the RISC-V
> > priv specification and there is no alternate semantics assumed by KVM RISC-V.
> >
> > At the moment, for KVM RISC-V we are converging towards the following
> > approach:
> >
> > 1) Only delegate "supervisor expected" traps to VS-mode via hedeleg
> > which supervisor software is generally expected to directly handle such
> > as breakpoint, user syscall, inst page fault, load page fault, and store
> > page fault.
> >
> > 2) Other "supervisor unexpected" traps are redirected to VS-mode via
> > software in HS-mode because these are not typically expected by supervisor
> > software and KVM RISC-V should at least gather some stats for such traps.
>
> Can you point me to where we collect stats for these traps? I don't see any code
> in kvm_riscv_vcpu_exit() that does this.
>

It doesn't exist today. But we should add it for debug purposes IMO.

> > Previously, we were redirecting such unexpect traps to KVM user space
> > where the KVM user space tool will simply dump the VCPU state and kill
> > the Guest/VM.
>
> Currently we have 5 exception types that go through software in HS-mode but
> never kill the guest: EXC_INST_ILLEGAL, EXC_LOAD_MISALIGNED,
> EXC_STORE_MISALIGNED, EXC_LOAD_ACCESS, and EXC_STORE_ACCESS. Are those
> considered "expected" or "unexpected"?
>
> > The inst misaligned trap was historically always set in hedeleg but we
> > should update it based on the above approach.
>
> What are the criteria for determining if a trap is "supervisor expected" or
> "supervisor unexpected"? Certainly any trap that can be triggered by misbehaved
> software in VU-mode should not kill the guest. Similarly, any trap that can be
> triggered by a misbehaved nested VS-mode guest should not kill the outer guest
> either.
>
> So the only reason I see for not delegating them is to collect stats, but I
> wonder if that is worth the performance cost. I would rather make misaligned
> loads/stores (for example) faster in the guest than have a count of them at the
> hypervisor level.
>

These are not in a hot path where updating a stats counter will affect
the performance.
I think we should use both kvm stats (for monitoring from the host)
and SBI PMU firmware counters (for monitoring within the guest)
which will help both guest and the host performance bottlenecks.

I will send a patch.

> Regards,
> Samuel
>


-- 
Regards,
Atish