Message-Id: <D2GR78QR1Y7K.3I08I56HLWKFT@gmail.com>
Date: Thu, 04 Jul 2024 22:29:23 +1000
From: "Nicholas Piggin" <npiggin@...il.com>
To: "Gautam Menghani" <gautam@...ux.ibm.com>, <mpe@...erman.id.au>,
<christophe.leroy@...roup.eu>, <naveen.n.rao@...ux.ibm.com>
Cc: <linuxppc-dev@...ts.ozlabs.org>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 1/2] Revert "KVM: PPC: Book3S HV Nested: Stop
forwarding all HFUs to L1"
On Fri Jun 28, 2024 at 4:03 AM AEST, Gautam Menghani wrote:
> This reverts commit 7c3ded5735141ff4d049747c9f76672a8b737c49.
>
> On PowerNV, when a nested guest tries to use a feature prohibited by
> HFSCR, the nested hypervisor (L1) should get a H_FAC_UNAVAILABLE trap
> and then L1 can emulate the feature. But with the change introduced by
> commit 7c3ded573514 ("KVM: PPC: Book3S HV Nested: Stop forwarding all HFUs to L1")
> the L1 ends up getting a H_EMUL_ASSIST because of which, the L1 ends up
> injecting a SIGILL when L2 (nested guest) tries to use doorbells.
Yeah, we struggled to come up with a coherent story for this kind of
compatibility and mismatched feature handling between L0 and L1.
The L1 doorbell emulation shows a legitimate case where the L1 wants to
see the HFAC so that it can emulate the facility, but the L0 does not
permit the L1 to set it for the L2.
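To make the doorbell case concrete, here is a rough userspace model of the decision in the hunk being reverted, next to the post-revert behaviour. It is only a sketch: the `model_*` names and `MODEL_CAUSE_DOORBELL` are made up for illustration and are not kernel identifiers, and the three mask arguments stand in for `vcpu->arch.hfscr`, `vcpu->arch.hfscr_permitted` and `vcpu->arch.nested_hfscr` from the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative cause code for the doorbell facility. The exact value is an
 * assumption of this sketch, not taken from the patch. */
#define MODEL_CAUSE_DOORBELL 10
static_assert(MODEL_CAUSE_DOORBELL < 64, "cause must index a 64-bit mask");

/* What the L1 ends up seeing for an L2 facility-unavailable exit. */
enum model_trap { MODEL_TRAP_HFU, MODEL_TRAP_HEAI };

/* The cause code sits in the top byte of HFSCR, as in the reverted hunk. */
static inline unsigned int model_hfu_cause(uint64_t hfscr)
{
	return (unsigned int)(hfscr >> 56);
}

/* Bounds-checked bit test; causes >= 64 are treated as "bit clear" here,
 * which is a choice of this model rather than kernel behaviour. */
static inline bool model_bit_set(uint64_t mask, unsigned int cause)
{
	return cause < 64 && (mask & (UINT64_C(1) << cause)) != 0;
}

/* Pre-revert: forward the HFU only if the L0 permits the facility and the
 * L1 itself disabled it for the L2; otherwise convert it to a HEAI. */
static inline enum model_trap
model_trap_before_revert(uint64_t hfscr, uint64_t permitted,
			 uint64_t nested_hfscr)
{
	unsigned int cause = model_hfu_cause(hfscr);

	if (!model_bit_set(permitted, cause) ||
	    model_bit_set(nested_hfscr, cause))
		return MODEL_TRAP_HEAI;
	return MODEL_TRAP_HFU;
}

/* Post-revert: every HFU is forwarded to the L1 unchanged. */
static inline enum model_trap
model_trap_after_revert(uint64_t hfscr, uint64_t permitted,
			uint64_t nested_hfscr)
{
	(void)hfscr;
	(void)permitted;
	(void)nested_hfscr;
	return MODEL_TRAP_HFU;
}
```

With the doorbell bit clear in the permitted mask, the first function returns the HEAI that leads to the SIGILL described in the changelog, while the second hands the L1 the HFU it needs in order to emulate the doorbell.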
Actually the L0 could just permit it (even if the L0 wanted to emulate
doorbells for the L1, it could still allow the L2 to run with doorbells
if that's what the L1 asked for). That would also solve this problem,
but there is a potential future hardware change where doorbells will be
able to address any thread in the core even in "LPAR-per-thread" mode.
In that case the hypervisor *must* disable the doorbell facility in the
guest's HFSCR if it runs KVM-style, scheduling LPARs on a per-thread
basis instead of per-core, so the L0 must not permit the L2 to run with
that HFSCR bit set. So this approach actually works better there.
There may be other cases where the L0 deliberately prohibits some
facility in a way that means we don't want the L1 to see the HFAC. I
think we just cross that bridge when we come to it. I'm sure the L0 would
really need to advertise that to the L1 properly via device-tree or
similar, and we could special-case the HFAC->HEAI conversion then if
necessary.
Reviewed-by: Nicholas Piggin <npiggin@...il.com>
>
> Signed-off-by: Gautam Menghani <gautam@...ux.ibm.com>
> ---
> arch/powerpc/kvm/book3s_hv.c | 31 ++-----------------------------
> 1 file changed, 2 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index daaf7faf21a5..cea28ac05923 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -2052,36 +2052,9 @@ static int kvmppc_handle_nested_exit(struct kvm_vcpu *vcpu)
> fallthrough; /* go to facility unavailable handler */
> #endif
>
> - case BOOK3S_INTERRUPT_H_FAC_UNAVAIL: {
> - u64 cause = vcpu->arch.hfscr >> 56;
> -
> - /*
> - * Only pass HFU interrupts to the L1 if the facility is
> - * permitted but disabled by the L1's HFSCR, otherwise
> - * the interrupt does not make sense to the L1 so turn
> - * it into a HEAI.
> - */
> - if (!(vcpu->arch.hfscr_permitted & (1UL << cause)) ||
> - (vcpu->arch.nested_hfscr & (1UL << cause))) {
> - ppc_inst_t pinst;
> - vcpu->arch.trap = BOOK3S_INTERRUPT_H_EMUL_ASSIST;
> -
> - /*
> - * If the fetch failed, return to guest and
> - * try executing it again.
> - */
> - r = kvmppc_get_last_inst(vcpu, INST_GENERIC, &pinst);
> - vcpu->arch.emul_inst = ppc_inst_val(pinst);
> - if (r != EMULATE_DONE)
> - r = RESUME_GUEST;
> - else
> - r = RESUME_HOST;
> - } else {
> - r = RESUME_HOST;
> - }
> -
> + case BOOK3S_INTERRUPT_H_FAC_UNAVAIL:
> + r = RESUME_HOST;
> break;
> - }
>
> case BOOK3S_INTERRUPT_HV_RM_HARD:
> vcpu->arch.trap = 0;