Message-ID: <c9b893a643cea2ffd4324c25b9d169f920db1ad4.camel@redhat.com>
Date: Thu, 30 Oct 2025 15:57:37 -0400
From: mlevitsk@...hat.com
To: Sean Christopherson <seanjc@...gle.com>,
	Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
	Borislav Petkov <bp@...en8.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/3] KVM: x86: Fix a semi theoretical bug in kvm_arch_async_page_present_queued

On Mon, 2025-10-27 at 08:00 -0700, Sean Christopherson wrote:
> On Tue, Sep 23, 2025, Sean Christopherson wrote:
> > On Tue, Sep 23, 2025, Paolo Bonzini wrote:
> > > On 9/23/25 20:55, Sean Christopherson wrote:
> > > > On Tue, Sep 23, 2025, Paolo Bonzini wrote:
> > > > > On 8/13/25 21:23, Maxim Levitsky wrote:
> > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > index 9018d56b4b0a..3d45a4cd08a4 100644
> > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > @@ -13459,9 +13459,14 @@ void kvm_arch_async_page_present(struct kvm_vcpu *vcpu,
> > > > > >  void kvm_arch_async_page_present_queued(struct kvm_vcpu *vcpu)
> > > > > >  {
> > > > > > -	kvm_make_request(KVM_REQ_APF_READY, vcpu);
> > > > > > -	if (!vcpu->arch.apf.pageready_pending)
> > > > > > +	/* Pairs with smp_store_release in vcpu_enter_guest. */
> > > > > > +	bool in_guest_mode = (smp_load_acquire(&vcpu->mode) == IN_GUEST_MODE);
> > > > > > +	bool page_ready_pending = READ_ONCE(vcpu->arch.apf.pageready_pending);
> > > > > > +
> > > > > > +	if (!in_guest_mode || !page_ready_pending) {
> > > > > > +		kvm_make_request(KVM_REQ_APF_READY, vcpu);
> > > > > >  		kvm_vcpu_kick(vcpu);
> > > > > > +	}
> > > > >
> > > > > Unlike Sean, I think the race exists in the abstract and is not benign
> > > >
> > > > How is it not benign? I never said the race doesn't exist, I said that consuming
> > > > a stale vcpu->arch.apf.pageready_pending in kvm_arch_async_page_present_queued()
> > > > is benign.
> > >
> > > In principle there is a possibility that a KVM_REQ_APF_READY is missed.
> >
> > I think you mean a kick (wakeup or IPI) is missed, not that the APF_READY itself
> > is missed. I.e. KVM_REQ_APF_READY will never be lost, KVM just might enter the
> > guest or schedule out the vCPU with the flag set.
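> >
> > For context, the request is consumed in vcpu_enter_guest(), roughly like
> > this (a paraphrased sketch of arch/x86/kvm/x86.c, not the verbatim code):
> >
> > 	if (kvm_request_pending(vcpu)) {
> > 		...
> > 		/* The request stays pending until vCPU context consumes it. */
> > 		if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
> > 			kvm_check_async_pf_completion(vcpu);
> > 		...
> > 	}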
> >
> > All in all, I think we're in violent agreement. I agree that kvm_vcpu_kick()
> > could be missed (theoretically), but I'm saying that missing the kick would be
> > benign due to a myriad of other barriers and checks, i.e. that the vCPU is
> > guaranteed to see KVM_REQ_APF_READY anyways.
> >
> > E.g. my suggestion earlier regarding OUTSIDE_GUEST_MODE was to rely on the
> > smp_mb__after_srcu_read_{,un}lock() barriers in vcpu_enter_guest() to ensure
> > KVM_REQ_APF_READY would be observed before trying VM-Enter, and that if KVM might
> > be in the process of emulating HLT (blocking), either KVM_REQ_APF_READY is
> > visible to the vCPU or kvm_arch_async_page_present() wakes the vCPU. Oh,
> > hilarious, async_pf_execute() also does an unconditional __kvm_vcpu_wake_up().
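> >
> > For reference, the tail of async_pf_execute() in virt/kvm/async_pf.c looks
> > roughly like this (a sketch with details elided):
> >
> > 	spin_lock(&vcpu->async_pf.lock);
> > 	first = list_empty(&vcpu->async_pf.done);
> > 	list_add_tail(&apf->link, &vcpu->async_pf.done);
> > 	spin_unlock(&vcpu->async_pf.lock);
> >
> > 	if (first)
> > 		kvm_arch_async_page_present_queued(vcpu);
> >
> > 	__kvm_vcpu_wake_up(vcpu);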
> >
> > Huh. But isn't that a real bug? KVM doesn't consider KVM_REQ_APF_READY to be a
> > wake event, so isn't this an actual race?
> >
> > 	vCPU					async #PF
> > 	kvm_check_async_pf_completion()
> > 	  pageready_pending = false
> > 	VM-Enter
> > 	HLT
> > 	VM-Exit
> > 						kvm_make_request(KVM_REQ_APF_READY, vcpu)
> > 						kvm_vcpu_kick(vcpu)	// nop as the vCPU isn't blocking, yet
> > 						__kvm_vcpu_wake_up()	// nop for the same reason
> > 	vcpu_block()
> > 	<hang>
> >
> > On x86, the "page ready" IRQ is only injected from vCPU context, so AFAICT
> > nothing is guaranteed to wake the vCPU in the above sequence.
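> >
> > The vCPU-context path being referenced is kvm_check_async_pf_completion(),
> > roughly (a sketch with details elided):
> >
> > 	void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
> > 	{
> > 		while (!list_empty_careful(&vcpu->async_pf.done) &&
> > 		       kvm_arch_can_dequeue_async_page_present(vcpu)) {
> > 			...
> > 			/* On x86, this is what injects the "page ready" IRQ. */
> > 			kvm_arch_async_page_present(vcpu, work);
> > 			...
> > 		}
> > 	}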
>
> Gah, KVM checks async_pf.done instead of the request. So I don't think there's
> a bug, just weird code.
Hi!

Note that I posted a v2 of this patch series. Should I drop this patch, or is
it better to keep it? (The patch should still be correct, but it may be
overkill, I think.)

What do you think?

Also, can we have patch 3 of the v2 merged? It fixes a real issue, one that
actually causes random and hard-to-debug failures.

Best regards,
	Maxim Levitsky
>
> bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
> {
> 	if (!list_empty_careful(&vcpu->async_pf.done))		<===
> 		return true;
>