Message-ID: <20240906204515.3276696-1-vipinsh@google.com>
Date: Fri,  6 Sep 2024 13:45:13 -0700
From: Vipin Sharma <vipinsh@...gle.com>
To: seanjc@...gle.com, pbonzini@...hat.com, dmatlack@...gle.com
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Vipin Sharma <vipinsh@...gle.com>
Subject: [PATCH v3 0/2]  KVM: x86/mmu: Run NX huge page recovery under MMU
 read lock

Split NX huge page recovery into two separate flows, one for the TDP MMU
and one for the non-TDP MMU.

The TDP MMU flow takes the MMU lock for read and the non-TDP MMU flow
takes it for write. This unblocks vCPUs that would otherwise wait for
the MMU read lock while NX huge page recovery is running and zapping MMU
pages.
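
To illustrate why the lock mode matters, here is a minimal sketch of
reader/writer admission rules. It is a toy single-threaded model, not
the kernel's rwlock_t and not code from this series; every name in it
(toy_rwlock, vcpus_admitted(), etc.) is hypothetical. It only assumes
that vCPU fault handling on the TDP MMU takes the MMU lock for read.

```c
#include <stdbool.h>

/*
 * Toy, single-threaded model of reader/writer lock admission rules.
 * NOT the kernel's rwlock_t; all names here are hypothetical.
 */
struct toy_rwlock {
	unsigned int readers;	/* number of current read-side holders */
	bool writer;		/* true while a writer holds the lock */
};

/* A reader is admitted unless a writer holds the lock. */
bool toy_try_read_lock(struct toy_rwlock *l)
{
	if (l->writer)
		return false;
	l->readers++;
	return true;
}

/* A writer is admitted only when nobody holds the lock. */
bool toy_try_write_lock(struct toy_rwlock *l)
{
	if (l->writer || l->readers)
		return false;
	l->writer = true;
	return true;
}

/*
 * Count how many of nr_vcpus read-side attempts are admitted while the
 * recovery worker holds the lock in the given mode.
 */
unsigned int vcpus_admitted(bool recovery_holds_write, unsigned int nr_vcpus)
{
	struct toy_rwlock lock = { 0, false };
	unsigned int admitted = 0;
	unsigned int i;

	/* The recovery worker takes the lock first. */
	if (recovery_holds_write)
		toy_try_write_lock(&lock);
	else
		toy_try_read_lock(&lock);

	/* Each vCPU then tries to take the lock for read. */
	for (i = 0; i < nr_vcpus; i++)
		if (toy_try_read_lock(&lock))
			admitted++;

	return admitted;
}
```

In this model, no vCPU gets in while the worker holds the lock for
write, and all of them get in while it holds the lock for read.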

A Windows guest showed network latency jitter, which was root-caused to
vCPUs waiting for the MMU read lock while the NX huge page recovery
thread held the MMU write lock. Disabling NX huge page recovery fixed
the jitter.

To address this, NX huge page recovery was modified to run under the MMU
read lock. With this change the jitter disappeared completely and the
time vCPUs spend waiting for the MMU read lock dropped drastically. The
commit log of patch 2 has the data from the tool showing the observed
improvement.

Patch 1 splits NX huge page tracking into two lists, one for the TDP
MMU and one for the shadow and legacy MMU. Patch 2 adds support for
running the recovery worker under the MMU read lock for TDP MMU pages.
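
A rough sketch of what split tracking could look like follows. It is a
simplified illustration only: the struct and function names are
hypothetical, and it uses a singly linked list where the kernel uses
struct list_head. The track and untrack helpers take a pointer to the
list-plus-counter they should update, so the same code serves both
MMUs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shadow page that may need recovery. */
struct toy_sp {
	struct toy_sp *next;
	bool tracked;
};

/* One list of possible NX huge pages plus its page count. */
struct toy_nx_pages {
	struct toy_sp *head;
	unsigned long nr_pages;
};

/* Per-VM state: one tracker per MMU flavour (names are assumptions). */
struct toy_kvm_arch {
	struct toy_nx_pages shadow_nx_pages;
	struct toy_nx_pages tdp_nx_pages;
};

/* Add sp to the given tracker; the caller picks the tracker by pointer. */
void toy_track(struct toy_nx_pages *pages, struct toy_sp *sp)
{
	if (sp->tracked)
		return;
	sp->next = pages->head;
	pages->head = sp;
	sp->tracked = true;
	pages->nr_pages++;
}

/* Remove sp from the given tracker if it is on that list. */
void toy_untrack(struct toy_nx_pages *pages, struct toy_sp *sp)
{
	struct toy_sp **link;

	for (link = &pages->head; *link; link = &(*link)->next) {
		if (*link == sp) {
			*link = sp->next;
			sp->next = NULL;
			sp->tracked = false;
			pages->nr_pages--;
			return;
		}
	}
}
```

With separate trackers, a recovery pass over TDP MMU pages never has to
walk or account for shadow MMU pages, and vice versa.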

v3:
- Use pointers in track and untrack NX huge pages APIs for accounting.
- Remove #ifdefs from v2.
- Fix error in v2 where TDP MMU flow was using
  cond_resched_rwlock_write() instead of cond_resched_rwlock_read() 
- Keep common code for both TDP and non-TDP MMU logic.
- Create wrappers for TDP MMU data structures to avoid #ifdefs.

v2: https://lore.kernel.org/kvm/20240829191135.2041489-1-vipinsh@google.com/#t
- Track legacy and TDP MMU NX huge pages separately.
- Each list has its own calculation of "to_zap", i.e. the number of
  pages to zap.
- Unaccount huge page before dirty log check and zap logic in TDP MMU recovery
  worker. Check patch 4 for more details.
- 32 bit build issue fix.
- Sparse warning fix for comparing RCU pointer with non-RCU pointer.
  (sp->spt == spte_to_child_pt())
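
For reference, a per-list "to_zap" calculation might look like the
sketch below. It assumes the ceiling-division form used by the existing
recovery worker, where a recovery ratio of 0 means nothing is zapped;
the function name toy_to_zap() is hypothetical and not taken from the
patches.

```c
/*
 * Number of pages to zap from one list in a single recovery pass:
 * ceil(nr_pages / ratio), or 0 when the ratio is 0.
 * Hypothetical helper; not the series' actual code.
 */
unsigned long toy_to_zap(unsigned long nr_pages, unsigned int ratio)
{
	if (!ratio)
		return 0;
	return (nr_pages + ratio - 1) / ratio;
}
```

Called once per list, e.g. toy_to_zap(tdp_count, ratio) and
toy_to_zap(shadow_count, ratio), so a long list on one MMU does not
inflate the amount of work done on the other.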


v1: https://lore.kernel.org/kvm/20240812171341.1763297-1-vipinsh@google.com/#t

Vipin Sharma (2):
  KVM: x86/mmu: Track TDP MMU NX huge pages separately
  KVM: x86/mmu: Recover TDP MMU NX huge pages using MMU read lock

 arch/x86/include/asm/kvm_host.h |  13 +++-
 arch/x86/kvm/mmu/mmu.c          | 116 ++++++++++++++++++++++----------
 arch/x86/kvm/mmu/mmu_internal.h |   8 ++-
 arch/x86/kvm/mmu/tdp_mmu.c      |  73 ++++++++++++++++----
 arch/x86/kvm/mmu/tdp_mmu.h      |   6 +-
 5 files changed, 164 insertions(+), 52 deletions(-)


base-commit: 332d2c1d713e232e163386c35a3ba0c1b90df83f
-- 
2.46.0.469.g59c65b2a67-goog

