Message-ID: <20130314013525.GA11710@amt.cnet>
Date:	Wed, 13 Mar 2013 22:35:25 -0300
From:	Marcelo Tosatti <mtosatti@...hat.com>
To:	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
Cc:	Gleb Natapov <gleb@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, KVM <kvm@...r.kernel.org>
Subject: Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages

On Wed, Mar 13, 2013 at 10:07:06PM -0300, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
> > The current kvm_mmu_zap_all is really slow - it holds mmu-lock while
> > walking and zapping all shadow pages one by one, and it also needs to
> > zap every guest page's rmap and every shadow page's parent spte list.
> > Things get worse as the guest uses more memory or vcpus, so it does
> > not scale.
> > 
> > Since all shadow pages will be zapped, we can directly zap the mmu-cache
> > and rmap so that vcpus fault on the new mmu-cache; after that, we can
> > directly free the memory used by the old mmu-cache.
> > 
> > The root shadow pages are a little special since they are currently in
> > use by vcpus, so we cannot free them directly. Instead, we zap the root
> > shadow pages and re-add them to the new mmu-cache.
> > 
> > After this patch, kvm_mmu_zap_all is 113% faster than before.
> > 
> > Signed-off-by: Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
> > ---
> >  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 files changed, 56 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index e326099..536d9ce 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> > 
> >  void kvm_mmu_zap_all(struct kvm *kvm)
> >  {
> > -	struct kvm_mmu_page *sp, *node;
> > +	LIST_HEAD(root_mmu_pages);
> >  	LIST_HEAD(invalid_list);
> > +	struct list_head pte_list_descs;
> > +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
> > +	struct kvm_mmu_page *sp, *node;
> > +	struct pte_list_desc *desc, *ndesc;
> > +	int root_sp = 0;
> > 
> >  	spin_lock(&kvm->mmu_lock);
> > +
> >  restart:
> > -	list_for_each_entry_safe(sp, node,
> > -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
> > -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> > -			goto restart;
> > +	/*
> > +	 * The root shadow pages are in use by vcpus and cannot be
> > +	 * removed directly, so we filter them out and re-add them to
> > +	 * the new mmu cache.
> > +	 */
> > +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
> > +		if (sp->root_count) {
> > +			int ret;
> > +
> > +			root_sp++;
> > +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> > +			list_move(&sp->link, &root_mmu_pages);
> > +			if (ret)
> > +				goto restart;
> > +		}
> 
> Why is it safe to skip flushing of root pages, for all
> kvm_flush_shadow() callers?

You are not skipping the flush here, only moving the root pages to the
new mmu cache.
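
To illustrate why the flush is preserved, here is a minimal sketch
modeled on the 3.x-era kvm_mmu_commit_zap_page (simplified, not the
exact committed code; the helper name commit_zap_sketch is made up for
this example). The TLB flush is tied to committing the invalid_list,
not to freeing any individual root page, so moving the roots to the new
cache does not skip it:

	/* simplified sketch of kvm_mmu_commit_zap_page, for illustration */
	static void commit_zap_sketch(struct kvm *kvm,
				      struct list_head *invalid_list)
	{
		struct kvm_mmu_page *sp, *nsp;

		if (list_empty(invalid_list))
			return;

		/*
		 * Make every vcpu drop stale translations before the
		 * pages backing them are freed.
		 */
		kvm_flush_remote_tlbs(kvm);

		list_for_each_entry_safe(sp, nsp, invalid_list, link) {
			WARN_ON(!sp->role.invalid || sp->root_count);
			kvm_mmu_free_page(sp);
		}
	}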

> Should revisit KVM_REQ_MMU_RELOAD... not clear it is necessary for NPT
> (unrelated).

Actually, what I meant is: you can batch the KVM_REQ_MMU_RELOAD requests
at the end of kvm_mmu_zap_all. Waking the vcpus up earlier is not optimal
since they are going to contend for mmu_lock anyway; a rough sketch of
what I mean follows.
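
Something along these lines (a sketch only, against this patchset's
mmu_cache layout; kvm_mmu_zap_all_batched is a hypothetical name, and
it assumes kvm_mmu_prepare_zap_page no longer raises the reload for
root pages itself; kvm_reload_remote_mmus() is the existing helper that
raises KVM_REQ_MMU_RELOAD on every vcpu):

	void kvm_mmu_zap_all_batched(struct kvm *kvm)
	{
		LIST_HEAD(invalid_list);
		struct kvm_mmu_page *sp, *node;

		spin_lock(&kvm->mmu_lock);
	restart:
		list_for_each_entry_safe(sp, node,
		      &kvm->arch.mmu_cache.active_mmu_pages, link)
			if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
				goto restart;

		kvm_mmu_commit_zap_page(kvm, &invalid_list);
		spin_unlock(&kvm->mmu_lock);

		/*
		 * One batched reload at the end instead of one request
		 * per zapped root: vcpus are kicked only once, after
		 * mmu_lock has been dropped.
		 */
		kvm_reload_remote_mmus(kvm);
	}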

I need more time to give more useful comments on this patchset, sorry.

