linux-kernel - [PATCH v4 0/6] KVM: MMU: fast zap all shadow pages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Date:	Sat, 27 Apr 2013 11:13:16 +0800
From:	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
To:	mtosatti@...hat.com
Cc:	gleb@...hat.com, avi.kivity@...il.com,
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
Subject: [PATCH v4 0/6] KVM: MMU: fast zap all shadow pages

This patchset is based on current 'queue' branch on kvm tree.

Changlog:

V4:
  1): drop unmapping invalid rmap out of mmu-lock and use lock-break technique
      instead. Thanks to Gleb's comments.

  2): needn't handle invalid-gen pages specially due to page table always
      switched by KVM_REQ_MMU_RELOAD. Thanks to Marcelo's comments.

V3:
  completely redesign the algorithm, please see below.

V2:
  - do not reset n_requested_mmu_pages and n_max_mmu_pages
  - batch free root shadow pages to reduce vcpu notification and mmu-lock
    contention
  - remove the first patch that introduce kvm->arch.mmu_cache since we only
    'memset zero' on hashtable rather than all mmu cache members in this
    version
  - remove unnecessary kvm_reload_remote_mmus after kvm_mmu_zap_all

* Issue
The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
walk and zap all shadow pages one by one, also it need to zap all guest
page's rmap and all shadow page's parent spte list. Particularly, things
become worse if guest uses more memory or vcpus. It is not good for
scalability.

* Idea
KVM maintains a global mmu invalid generation-number which is stored in
kvm->arch.mmu_valid_gen and every shadow page stores the current global
generation-number into sp->mmu_valid_gen when it is created.

When KVM need zap all shadow pages sptes, it just simply increase the
global generation-number then reload root shadow pages on all vcpus.
Vcpu will create a new shadow page table according to current kvm's
generation-number. It ensures the old pages are not used any more.

The invalid-gen pages (sp->mmu_valid_gen != kvm->arch.mmu_valid_gen)
are keeped in mmu-cache until page allocator reclaims page.

* Challenges
Some page invalidation is requested when memslot is moved or deleted
and kvm is being destroy who call zap_all_pages to delete all sp using
their rmap and lpage-info, after call zap_all_pages, the rmap and lpage-info
will be freed.

For the lpage-info, we clear all lpage count when do zap-all-pages, then
all invalid shadow pages are not counted in lpage-info, after that lpage-info
on the invalid memslot can be safely freed. This is also good for the
performance - it allows guest to use hugepage as far as possible.

For the rmap, we use lock-break technique to zap all sptes linked on the
invalid rmap, it is not very effective but good for the first step.

* TODO
Unmapping invalid rmap out of mmu-lock with a clear way.

Xiao Guangrong (6):
  KVM: MMU: drop unnecessary kvm_reload_remote_mmus
  KVM: x86: introduce memslot_set_lpage_disallowed
  KVM: MMU: introduce kvm_clear_all_lpage_info
  KVM: MMU: fast invalid all shadow pages
  KVM: x86: use the fast way to invalid all pages
  KVM: MMU: make kvm_mmu_zap_all preemptable

 arch/x86/include/asm/kvm_host.h |    2 +
 arch/x86/kvm/mmu.c              |   88 ++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/mmu.h              |    2 +
 arch/x86/kvm/x86.c              |   87 ++++++++++++++++++++++++++++-----------
 arch/x86/kvm/x86.h              |    2 +
 5 files changed, 155 insertions(+), 26 deletions(-)

-- 
1.7.7.6

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/