Date:	Thu, 14 Mar 2013 12:42:09 +0800
From:	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
CC:	Gleb Natapov <gleb@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, KVM <kvm@...r.kernel.org>
Subject: Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages

On 03/14/2013 09:35 AM, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 10:07:06PM -0300, Marcelo Tosatti wrote:
>> On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
>>> The current kvm_mmu_zap_all is really slow: it holds mmu-lock while it
>>> walks and zaps all shadow pages one by one, and it also needs to zap every
>>> guest page's rmap and every shadow page's parent spte list. Things get
>>> worse as the guest uses more memory or more vcpus, so it does not scale
>>> well.
>>>
>>> Since all shadow pages will be zapped, we can directly zap the mmu-cache
>>> and rmap so that vcpus will fault on the new mmu-cache; after that, we can
>>> directly free the memory used by the old mmu-cache.
>>>
>>> The root shadow pages are a little special: since they are currently used
>>> by vcpus, we cannot free them directly. So we zap the root shadow pages and
>>> re-add them to the new mmu-cache.
>>>
>>> After this patch, kvm_mmu_zap_all can be 113% faster than before.
>>>
>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
>>> ---
>>>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
>>>  1 files changed, 56 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index e326099..536d9ce 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>>>
>>>  void kvm_mmu_zap_all(struct kvm *kvm)
>>>  {
>>> -	struct kvm_mmu_page *sp, *node;
>>> +	LIST_HEAD(root_mmu_pages);
>>>  	LIST_HEAD(invalid_list);
>>> +	struct list_head pte_list_descs;
>>> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
>>> +	struct kvm_mmu_page *sp, *node;
>>> +	struct pte_list_desc *desc, *ndesc;
>>> +	int root_sp = 0;
>>>
>>>  	spin_lock(&kvm->mmu_lock);
>>> +
>>>  restart:
>>> -	list_for_each_entry_safe(sp, node,
>>> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
>>> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
>>> -			goto restart;
>>> +	/*
>>> +	 * The root shadow pages are in use by vcpus and cannot be
>>> +	 * removed directly, so we filter them out and re-add them to
>>> +	 * the new mmu cache.
>>> +	 */
>>> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
>>> +		if (sp->root_count) {
>>> +			int ret;
>>> +
>>> +			root_sp++;
>>> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
>>> +			list_move(&sp->link, &root_mmu_pages);
>>> +			if (ret)
>>> +				goto restart;
>>> +		}
>>
>> Why is it safe to skip flushing of root pages, for all
>> kvm_flush_shadow() callers?
> 
> You are not skipping the flush, only moving to the new mmu cache.
> 
>> Should revisit KVM_REQ_MMU_RELOAD... not clear it is necessary for NPT
>> (unrelated).
> 
> Actually, what i meant is: you can batch KVM_REQ_MMU_RELOAD requests to
> the end of kvm_mmu_zap_all. Waking up vcpus is not optimal since they're
> going to contend for mmu_lock anyway.

Yes, I agree. I will move the KVM_REQ_MMU_RELOAD request to the end of
kvm_mmu_zap_all in V2.

BTW, the TLB flush is not needed if no root shadow page was zapped, since in
that case no vcpu is using shadow pages. After batching KVM_REQ_MMU_RELOAD,
the code can be simplified to:

if (root_sp)
	kvm_reload_remote_mmus(kvm);
> 
> Need more time to have more useful comments to this patchset, sorry.

No problem. ;) The current comments are really useful to me.
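
To illustrate the batching idea discussed above, here is a minimal user-space
sketch in C. It is not kernel code and not the V2 patch: struct page,
struct vm, reload_remote_mmus() and zap_all() are hypothetical stand-ins for
the shadow page, the per-VM MMU state, kvm_reload_remote_mmus() and
kvm_mmu_zap_all(), and locking is left out entirely. It only shows the shape
of the logic: walk the active list once, keep the root pages, drop the rest,
and request a reload a single time at the end, and only when at least one
root page was found.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shadow page (not struct kvm_mmu_page). */
struct page {
	int root_count;		/* non-zero while a vcpu uses it as a root */
	struct page *next;
};

/* Hypothetical stand-in for the per-VM MMU state. */
struct vm {
	struct page *active;	/* singly linked list of active pages */
	int reload_requests;	/* number of reload requests issued */
};

/* Models one batched reload request that reaches every vcpu. */
static void reload_remote_mmus(struct vm *vm)
{
	vm->reload_requests++;
}

/*
 * Walk the active list once. Root pages are moved to a new list,
 * all other pages are unlinked (real code would free them later).
 * The reload request is issued once, after the walk, and is skipped
 * when no root page was seen. Returns the number of root pages kept.
 */
static int zap_all(struct vm *vm)
{
	struct page *roots = NULL, *sp, *next;
	int root_sp = 0;

	assert(vm != NULL);

	for (sp = vm->active; sp; sp = next) {
		next = sp->next;	/* save before relinking sp */
		if (sp->root_count) {
			sp->next = roots;
			roots = sp;
			root_sp++;
		} else {
			sp->next = NULL;
		}
	}
	vm->active = roots;

	if (root_sp)
		reload_remote_mmus(vm);

	return root_sp;
}
```

With three pages of which two are roots, zap_all() keeps the two roots and
issues exactly one reload request; with no root pages it issues none.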

