Message-ID: <CANgfPd-kYQgwdtm5uamYrPhq_V6DkocZXTq9iKzbfJaWcLy3Lw@mail.gmail.com>
Date: Mon, 12 Apr 2021 11:21:57 -0700
From: Ben Gardon <bgardon@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
Peter Xu <peterx@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Peter Shier <pshier@...gle.com>,
Peter Feiner <pfeiner@...gle.com>,
Junaid Shahid <junaids@...gle.com>,
Jim Mattson <jmattson@...gle.com>,
Yulei Zhang <yulei.kernel@...il.com>,
Wanpeng Li <kernellwp@...il.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Xiao Guangrong <xiaoguangrong.eric@...il.com>
Subject: Re: [PATCH v2 09/13] KVM: x86/mmu: Allow zap gfn range to operate
under the mmu read lock
On Fri, Apr 2, 2021 at 12:53 AM Paolo Bonzini <pbonzini@...hat.com> wrote:
>
> On 02/04/21 01:37, Ben Gardon wrote:
> > +void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
> > + bool shared)
> > {
> > gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
> >
> > - lockdep_assert_held_write(&kvm->mmu_lock);
> > + kvm_lockdep_assert_mmu_lock_held(kvm, shared);
> >
> > if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
> > return;
> > @@ -81,7 +92,7 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
> > list_del_rcu(&root->link);
> > spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> >
> > - zap_gfn_range(kvm, root, 0, max_gfn, false, false);
> > + zap_gfn_range(kvm, root, 0, max_gfn, false, false, shared);
> >
> > call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
>
> Instead of patch 13, would it make sense to delay the zap_gfn_range and
> call_rcu to a work item (either unconditionally, or only if
> shared==false)? Then the zap_gfn_range would be able to yield and take
> the mmu_lock for read, similar to kvm_tdp_mmu_zap_invalidated_roots.
>
> If done unconditionally, this would also allow removing the "shared"
> argument to kvm_tdp_mmu_put_root, tdp_mmu_next_root and
> for_each_tdp_mmu_root_yield_safe, so I would place that change before
> this patch.
>
> Paolo
>
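For concreteness, here is roughly how I read that suggestion (only a
sketch: the zap_work field, the tdp_mmu_kvm back-pointer, and the
dedicated tdp_mmu_zap_wq workqueue are made-up names, not something in
this series):

/* Hypothetical: kvm_mmu_page grows a work_struct and a kvm back-pointer. */
static void tdp_mmu_zap_root_work(struct work_struct *work)
{
        struct kvm_mmu_page *root = container_of(work, struct kvm_mmu_page,
                                                  zap_work);
        struct kvm *kvm = root->tdp_mmu_kvm;
        gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);

        /*
         * Running out of a work item, the zap can take mmu_lock for read
         * and yield, like kvm_tdp_mmu_zap_invalidated_roots().
         */
        read_lock(&kvm->mmu_lock);
        /* Assuming the can_yield/flush/shared trailing args from the patch. */
        zap_gfn_range(kvm, root, 0, max_gfn, true, false, true);
        read_unlock(&kvm->mmu_lock);

        call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
}

/* If the zap is deferred unconditionally, "shared" can be dropped here. */
void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
{
        if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
                return;

        spin_lock(&kvm->arch.tdp_mmu_pages_lock);
        list_del_rcu(&root->link);
        spin_unlock(&kvm->arch.tdp_mmu_pages_lock);

        root->tdp_mmu_kvm = kvm;
        INIT_WORK(&root->zap_work, tdp_mmu_zap_root_work);
        queue_work(kvm->arch.tdp_mmu_zap_wq, &root->zap_work);
}
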
I tried that and it created problems. I believe the issue was that on
VM teardown, memslots would be freed and the memory reallocated before
the root was torn down, resulting in a use-after-free from
mark_pfn_dirty. Perhaps this could be resolved by forcing memslot
changes to wait until that work item was processed before returning.
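For example, the existing memslot flush hook could wait for the
deferred zap (again only a sketch, with the same hypothetical
tdp_mmu_zap_wq workqueue as above); the VM teardown path would
presumably need similar treatment before its memslots are freed:

void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
                                   struct kvm_memory_slot *slot)
{
        /*
         * Wait for any deferred root zaps to finish before the slot (and
         * the guest memory backing it) can go away, so the deferred
         * zap_gfn_range() can't dirty pages through a stale memslot.
         */
        flush_workqueue(kvm->arch.tdp_mmu_zap_wq);

        kvm_page_track_flush_slot(kvm, slot);
}
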
I can look into it, but I suspect there will be a lot of "gotchas"
involved.