[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZvJ6N3oRBjlUf+l4@yzhao56-desk.sh.intel.com>
Date: Tue, 24 Sep 2024 16:37:11 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: <pbonzini@...hat.com>, <rick.p.edgecombe@...el.com>,
<kai.huang@...el.com>, <isaku.yamahata@...el.com>, <dmatlack@...gle.com>,
<sagis@...gle.com>, <erdemaktas@...gle.com>, <graf@...zon.com>,
<linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>
Subject: Re: [PATCH v2 1/4] KVM: x86/mmu: Introduce a quirk to control
memslot zap behavior
On Tue, Sep 24, 2024 at 12:45:52AM -0700, Sean Christopherson wrote:
> On Tue, Sep 24, 2024, Yan Zhao wrote:
> > On Mon, Sep 23, 2024 at 11:37:14AM -0700, Sean Christopherson wrote:
> > > > +static void kvm_mmu_zap_memslot_leafs(struct kvm *kvm, struct kvm_memory_slot *slot)
> > > > +{
> > > > + struct kvm_gfn_range range = {
> > > > + .slot = slot,
> > > > + .start = slot->base_gfn,
> > > > + .end = slot->base_gfn + slot->npages,
> > > > + .may_block = true,
> > > > + };
> > > > + bool flush = false;
> > > > +
> > > > + write_lock(&kvm->mmu_lock);
> > > > +
> > > > + if (kvm_memslots_have_rmaps(kvm))
> > > > + flush = kvm_handle_gfn_range(kvm, &range, kvm_zap_rmap);
> > >
> > > This, and Paolo's merged variant, break shadow paging. As was tried in commit
> > > 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot"),
> > > all shadow pages, i.e. non-leaf SPTEs, need to be zapped. All of the accounting
> > > for a shadow page is tied to the memslot, i.e. the shadow page holds a reference
> > > to the memslot, for all intents and purposes. Deleting the memslot without removing
> > > all relevant shadow pages results in NULL pointer derefs when tearing down the VM.
> > >
> > > Note, that commit is/was buggy, and I suspect my follow-up attempt[*] was as well.
> > > https://lore.kernel.org/all/20190820200318.GA15808@linux.intel.com
> > >
> > > Rather than trying to get this functional for shadow paging (which includes nested
> > > TDP), I think we should scrap the quirk idea and simply make this the behavior for
> > > S-EPT and nothing else.
> > Ok. Thanks for identifying this error. Will change code to this way.
>
> For now, I think a full revert of the entire series makes sense. Irrespective of
> this bug, I don't think KVM should commit to specific implementation behavior,
> i.e. KVM shouldn't explicitly say only leaf SPTEs are zapped. The quirk docs
> should instead say that if the quirk is disabled, KVM will only guarantee that
> the delete memslot will be inaccessible. That way, KVM can still do a fast zap
> when it makes sense, e.g. for large memslots, do a complete fast zap, and for
> small memslots, do a targeted zap of the TDP MMU but a fast zap of any shadow
> MMUs.
For TDX, could we do as below after the full revert of this series?
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot)
{
kvm_mmu_zap_all_fast(kvm); ==> this will skip mirror root
kvm_mmu_zap_memslot_mirror_leafs(kvm, slot); ==> zap memslot leaf entries
in mirror root
}
>
> > BTW: update some findings regarding to the previous bug with Nvidia GPU
> > assignment:
> > I found that after v5.19-rc1+, even with nx_huge_pages=N, the bug is not
> > reproducible when only leaf entries of memslot are zapped.
> > (no more detailed info due to limited time to debug).
>
> Heh, I've given up hope on ever finding a root cause for that issue.
Powered by blists - more mailing lists