Message-ID: <12e0b8b85930f158b35134d54d321ea6fc403584.camel@redhat.com>
Date: Thu, 09 Dec 2021 17:35:35 +0200
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
"open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
<linux-kernel@...r.kernel.org>, Wanpeng Li <wanpengli@...cent.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Joerg Roedel <joro@...tes.org>,
"H. Peter Anvin" <hpa@...or.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Borislav Petkov <bp@...en8.de>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Jim Mattson <jmattson@...gle.com>
Subject: Re: [PATCH 3/6] KVM: SVM: fix AVIC race of host->guest IPI delivery
vs AVIC inhibition
On Thu, 2021-12-09 at 17:33 +0200, Maxim Levitsky wrote:
> On Thu, 2021-12-09 at 15:27 +0000, Sean Christopherson wrote:
> > On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> > > On Thu, 2021-12-09 at 15:11 +0100, Paolo Bonzini wrote:
> > > > On 12/9/21 12:54, Maxim Levitsky wrote:
> > > > > If svm_deliver_avic_intr is called just after the target vcpu's AVIC got
> > > > > inhibited, it might read a stale value of vcpu->arch.apicv_active
> > > > > which can lead to the target vCPU not noticing the interrupt.
> > > > >
> > > > > Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> > > > > ---
> > > > > arch/x86/kvm/svm/avic.c | 16 +++++++++++++---
> > > > > 1 file changed, 13 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > > > index 859ad2dc50f1..8c1b934bfa9b 100644
> > > > > --- a/arch/x86/kvm/svm/avic.c
> > > > > +++ b/arch/x86/kvm/svm/avic.c
> > > > > @@ -691,6 +691,15 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
> > > > > * automatically process AVIC interrupts at VMRUN.
> > > > > */
> > > > > if (vcpu->mode == IN_GUEST_MODE) {
> > > > > +
> > > > > + /*
> > > > > + * At this point we had read the vcpu->arch.apicv_active == true
> > > > > + * and the vcpu->mode == IN_GUEST_MODE.
> > > > > + * Since we have a memory barrier after setting IN_GUEST_MODE,
> > > > > + * it ensures that AVIC inhibition is complete and thus
> > > > > + * the target is really running with AVIC enabled.
> > > > > + */
> > > > > +
> > > > > int cpu = READ_ONCE(vcpu->cpu);
> > > >
> > > > I don't think it's correct. The vCPU has apicv_active written (in
> > > > kvm_vcpu_update_apicv) before vcpu->mode.
> > >
> > > I thought that we have a full memory barrier just prior to setting IN_GUEST_MODE
> > > thus if I see vcpu->mode == IN_GUEST_MODE then I'll see correct apicv_active value.
> > > But apparently the memory barrier is after setting vcpu->mode.
> > >
> > >
> > > > For the acquire/release pair to work properly you need to 1) read
> > > > apicv_active *after* vcpu->mode here 2) use store_release and
> > > > load_acquire for vcpu->mode, respectively in vcpu_enter_guest and here.
> > >
> > > store_release for vcpu->mode in vcpu_enter_guest means a write barrier just before setting it,
> > > which I expected to be there.
> > >
> > > And yes I see now, I need a read barrier here as well. I am still learning this.
> >
> > Sans barriers and comments, can't this be written as returning an "error" if the
> > vCPU is not IN_GUEST_MODE? Effectively the same thing, but a little more precise
> > and it avoids duplicating the lapic.c code.
>
> Yes, besides the fact that we already set the vIRR bit, so if I return -1 here, it will be set again
> (and these are set using atomic ops)
>
> I don't know how much that matters except the fact that while a vCPU runs a nested guest,
> callers wishing to send IPI to it, will go through this code path a lot
> (even when I implement nested AVIC as it is a separate thing which is used by L2 only).
Ah, I hit send too soon. This makes sense to me now!
Best regards,
Maxim Levitsky
>
> Best regards,
> Maxim Levitsky
>
> > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > index 26ed5325c593..cddf7a8da3ea 100644
> > --- a/arch/x86/kvm/svm/avic.c
> > +++ b/arch/x86/kvm/svm/avic.c
> > @@ -671,7 +671,7 @@ void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
> >
> > int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
> > {
> > - if (!vcpu->arch.apicv_active)
> > + if (vcpu->mode != IN_GUEST_MODE || !vcpu->arch.apicv_active)
> > return -1;
> >
> > kvm_lapic_set_irr(vec, vcpu->arch.apic);
> > @@ -706,8 +706,9 @@ int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
> > put_cpu();
> > } else {
> > /*
> > - * Wake the vCPU if it was blocking. KVM will then detect the
> > - * pending IRQ when checking if the vCPU has a wake event.
> > + * Wake the vCPU if it is blocking. If the vCPU exited the
> > + * guest since the previous vcpu->mode check, it's guaranteed
> > + * to see the event before re-entering the guest.
> > */
> > kvm_vcpu_wake_up(vcpu);
> > }
> >
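
For reference, here is a minimal user-space sketch of the acquire/release pairing Paolo describes above: the writer updates apicv_active and then publishes it with a release store to the mode, and the reader does an acquire load of the mode before it looks at apicv_active. This is not the KVM code. It assumes C11 atomics in place of the kernel's smp_store_release()/smp_load_acquire(), and struct toy_vcpu, toy_enter_guest() and toy_deliver_intr() are hypothetical names made up for the illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the vcpu->mode values used in the thread. */
enum toy_mode { TOY_OUTSIDE_GUEST_MODE, TOY_IN_GUEST_MODE };

/* Hypothetical, heavily reduced model of the two fields being discussed. */
struct toy_vcpu {
	bool apicv_active;	/* plain data, published through 'mode' */
	_Atomic int mode;	/* synchronisation variable */
};

/*
 * Writer side (the role vcpu_enter_guest plays in the thread):
 * write apicv_active first, then publish it with a release store.
 */
static void toy_enter_guest(struct toy_vcpu *v, bool apicv_active)
{
	v->apicv_active = apicv_active;
	atomic_store_explicit(&v->mode, TOY_IN_GUEST_MODE,
			      memory_order_release);
}

/*
 * Reader side (the role svm_deliver_avic_intr plays in the thread):
 * acquire-load the mode first, and only then read apicv_active.
 * Returns 0 if the fast path may be used, -1 to ask the caller to
 * fall back to the generic delivery path.
 */
static int toy_deliver_intr(struct toy_vcpu *v)
{
	if (atomic_load_explicit(&v->mode, memory_order_acquire) !=
	    TOY_IN_GUEST_MODE)
		return -1;

	/* Ordered after the acquire load, so this sees the writer's value. */
	if (!v->apicv_active)
		return -1;

	return 0;
}
```

The order of the two checks in toy_deliver_intr() is the point: if apicv_active were read before the mode, the acquire load would not order it, and a stale value could still be observed. The sketch only models ordering between one writer and one reader. It does not model the vIRR update, the doorbell, or the wake-up path.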