[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZTiYp8nMGImoWzBZ@yilunxu-OptiPlex-7050>
Date: Wed, 25 Oct 2023 12:25:11 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Al Viro <viro@...iv.linux.org.uk>,
David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH 2/3] KVM: Always flush async #PF workqueue when vCPU is
being destroyed
On Tue, Oct 24, 2023 at 08:49:24AM -0700, Sean Christopherson wrote:
> On Tue, Oct 24, 2023, Xu Yilun wrote:
> > On Wed, Oct 18, 2023 at 01:46:23PM -0700, Sean Christopherson wrote:
> > > @@ -126,7 +124,19 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
> > > list_first_entry(&vcpu->async_pf.done,
> > > typeof(*work), link);
> > > list_del(&work->link);
> > > +
> > > + spin_unlock(&vcpu->async_pf.lock);
> > > +
> > > + /*
> > > + * The async #PF is "done", but KVM must wait for the work item
> > > + * itself, i.e. async_pf_execute(), to run to completion. If
> > > + * KVM is a module, KVM must ensure *no* code owned by the KVM
> > > + * (the module) can be run after the last call to module_put(),
> > > + * i.e. after the last reference to the last vCPU's file is put.
> > > + */
> > > + flush_work(&work->work);
> >
> > I see the flush_work() is inside the check:
> >
> > while (!list_empty(&vcpu->async_pf.done))
> >
> > Is it possible all async_pf are already completed but the work item,
> > i.e. async_pf_execute, is not completed before this check? That the
> > work is scheduled out after kvm_arch_async_page_present_queued() and
> > all APF_READY requests have been handled. In this case the work
> > synchronization will be skipped...
>
> Good gravy. Yes, I assumed KVM wouldn't be so crazy to delete the work before it
> completed, but I obviously didn't see this comment in async_pf_execute():
>
> /*
> * apf may be freed by kvm_check_async_pf_completion() after
> * this point
> */
>
> The most straightforward fix I see is to also flush the work in
> kvm_check_async_pf_completion(), and then delete the comment. The downside is
> that there's a small chance a vCPU could be delayed waiting for the work to
> complete, but that's a very, very small chance, and likely a very small delay.
> kvm_arch_async_page_present_queued() unconditionaly makes a new request, i.e. will
> effectively delay entering the guest, so the remaining work is really just:
>
> trace_kvm_async_pf_completed(addr, cr2_or_gpa);
>
> __kvm_vcpu_wake_up(vcpu);
>
> mmput(mm);
>
> Since mmput() can't drop the last reference to the page tables if the vCPU is
> still alive.
OK, seems the impact is minor. I'm good to it.
Thanks,
Yilun
>
> I think I'll spin off the async #PF fix to a separate series. There's are other
> tangetially related cleanups that can be done, e.g. there's no reason to pin the
> page tables while work is queued, async_pf_execute() can do mmget_not_zero() and
> then bail if the process is dying. Then there's no need to do mmput() when
> canceling work.
Powered by blists - more mailing lists