[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <rbx6n7cqmslnj4re44hb6hdqcuhfbqvmxo5ebsrk2dqia3hj2w@jljlwne2nzvf>
Date: Wed, 30 Oct 2024 19:03:13 +0530
From: Gautam Menghani <gautam@...ux.ibm.com>
To: Nicholas Piggin <npiggin@...il.com>
Cc: mpe@...erman.id.au, christophe.leroy@...roup.eu,
naveen.n.rao@...ux.ibm.com, linuxppc-dev@...ts.ozlabs.org,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] arch/powerpc/kvm: Fix doorbells for nested KVM
guests on PowerNV
On Thu, Jul 04, 2024 at 10:10:05PM +1000, Nicholas Piggin wrote:
> On Fri Jun 28, 2024 at 4:03 AM AEST, Gautam Menghani wrote:
> > commit 6398326b9ba1("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
> > introduced an optimization to use only vcpu->doorbell_request for SMT
> > emulation for Power9 and above guests, but the code for nested guests
> > still relies on the old way of handling doorbells, due to which an L2
> > guest cannot be booted with XICS with SMT>1. The command to repro
> > this issue is:
> >
> > qemu-system-ppc64 \
> > -drive file=rhel.qcow2,format=qcow2 \
> > -m 20G \
> > -smp 8,cores=1,threads=8 \
> > -cpu host \
> > -nographic \
> > -machine pseries,ic-mode=xics -accel kvm
> >
> > Fix the plumbing to utilize vcpu->doorbell_request instead of vcore->dpdes
> > on P9 and above.
> >
> > Fixes: 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
> > Signed-off-by: Gautam Menghani <gautam@...ux.ibm.com>
> > ---
> > arch/powerpc/kvm/book3s_hv.c | 9 ++++++++-
> > arch/powerpc/kvm/book3s_hv_nested.c | 20 ++++++++++++++++----
> > 2 files changed, 24 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > index cea28ac05923..0586fa636707 100644
> > --- a/arch/powerpc/kvm/book3s_hv.c
> > +++ b/arch/powerpc/kvm/book3s_hv.c
> > @@ -4178,6 +4178,9 @@ static int kvmhv_vcpu_entry_p9_nested(struct kvm_vcpu *vcpu, u64 time_limit, uns
> > }
> > hvregs.hdec_expiry = time_limit;
> >
> > + // clear doorbell bit as hvregs already has the info
> > + vcpu->arch.doorbell_request = 0;
> > +
> > /*
> > * When setting DEC, we must always deal with irq_work_raise
> > * via NMI vs setting DEC. The problem occurs right as we
> > @@ -4694,6 +4697,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> > struct kvm_nested_guest *nested = vcpu->arch.nested;
> > unsigned long flags;
> > u64 tb;
> > + bool doorbell_pending;
> >
> > trace_kvmppc_run_vcpu_enter(vcpu);
> >
> > @@ -4752,6 +4756,9 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> > */
> > smp_mb();
> >
> > + doorbell_pending = !cpu_has_feature(CPU_FTR_ARCH_300) &&
> > + vcpu->arch.doorbell_request;
>
> Hmm... is the feature test flipped here?
Sorry for responding late, I got involved in some other things.
Yes I think I got that part wrong, I guess it should've been
doorbell_pending = !cpu_has_feature(CPU_FTR_HVMODE) &&
vcpu->arch.doorbell_request;
The objective of introducing this is to avoid returning to L1 midway
when L0 is about to run L2. The issue is that if L1 does H_ENTER_NESTED
and there is a doorbell for L2, this condition in kvmhv_run_single_vcpu
will cause L0 to abort and go back to L1:
} else if (vcpu->arch.pending_exceptions ||
vcpu->arch.doorbell_request ||
xive_interrupt_pending(vcpu)) {
vcpu->arch.ret = RESUME_HOST;
goto out;
}
Earlier, vc->dpdes was used to pass around doorbell state, that's why
this condition did not cause problems, until
6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
>
> > +
> > if (!nested) {
> > kvmppc_core_prepare_to_enter(vcpu);
> > if (test_bit(BOOK3S_IRQPRIO_EXTERNAL,
> > @@ -4769,7 +4776,7 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
> > lpcr |= LPCR_MER;
> > }
> > } else if (vcpu->arch.pending_exceptions ||
> > - vcpu->arch.doorbell_request ||
> > + doorbell_pending ||
> > xive_interrupt_pending(vcpu)) {
> > vcpu->arch.ret = RESUME_HOST;
> > goto out;
> > diff --git a/arch/powerpc/kvm/book3s_hv_nested.c b/arch/powerpc/kvm/book3s_hv_nested.c
> > index 05f5220960c6..b34eefa6b268 100644
> > --- a/arch/powerpc/kvm/book3s_hv_nested.c
> > +++ b/arch/powerpc/kvm/book3s_hv_nested.c
> > @@ -32,7 +32,10 @@ void kvmhv_save_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
> > struct kvmppc_vcore *vc = vcpu->arch.vcore;
> >
> > hr->pcr = vc->pcr | PCR_MASK;
> > - hr->dpdes = vc->dpdes;
> > + if (cpu_has_feature(CPU_FTR_ARCH_300))
> > + hr->dpdes = vcpu->arch.doorbell_request;
> > + else
> > + hr->dpdes = vc->dpdes;
> > hr->hfscr = vcpu->arch.hfscr;
> > hr->tb_offset = vc->tb_offset;
> > hr->dawr0 = vcpu->arch.dawr0;
>
> Great find.
>
> Nested is all POWER9 and later only, so I think you can just
> change to using doorbell_request always.
Noted.
>
> And probably don't have to do anything for book3s_hv.c unless
> I'm mistaken about the feature test.
>
As pointed out above, the intention was to avoid the "else if" part in
kvmhv_run_single_vcpu(). Please do point out if I missed something here.
Thanks,
Gautam
Powered by blists - more mailing lists