Message-ID: <20250516215422.2550669-1-seanjc@google.com>
Date: Fri, 16 May 2025 14:54:19 -0700
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, 
	Vipin Sharma <vipinsh@google.com>
Subject: [PATCH v3 0/3] KVM: x86: Dynamically allocate hashed page list

Third time's a charm!  Right?  Right!?!?

Allocate the hashed list of shadow pages dynamically (separate from
struct kvm) and on demand.  The hashed list is 32KiB, i.e. it absolutely
belongs in a separate allocation, and is worth skipping if KVM isn't
shadowing guest PTEs for the VM.
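
To make the shape of the change concrete, a rough sketch (helper and field
names are illustrative, not necessarily the exact code in the patches): the
embedded array of KVM_NUM_MMU_PAGES (1 << 12) hlist heads, at 8 bytes apiece
on 64-bit, is where the 32KiB comes from, so turn it into a pointer in
struct kvm_arch that's only allocated when shadow paging is actually in play.

static int kvm_mmu_alloc_page_hash(struct kvm *kvm)
{
        struct hlist_head *h;

        /* Nothing to do if the table was already allocated. */
        if (kvm->arch.mmu_page_hash)
                return 0;

        /* kvcalloc() zeroes the buckets, i.e. every list starts out empty. */
        h = kvcalloc(KVM_NUM_MMU_PAGES, sizeof(*h), GFP_KERNEL_ACCOUNT);
        if (!h)
                return -ENOMEM;

        kvm->arch.mmu_page_hash = h;
        return 0;
}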

Side topic #1, a bunch of my measurements from v2 and earlier were "bad",
because I was using a PROVE_LOCKING=y kernel, which significantly inflates
the size of "struct kvm" in particular.

Side topic #2, I have a patch to dynamically allocate the memslots hash
tables (they're very conveniently either 2KiB or 4KiB in size for 64-bit
kernels), but I couldn't convince myself that the complexity is in any way
justified.  I did however account for the size of the hash tables in the
assertions, if only to document where a big chunk of the per-VM memory usage
is going.
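
For reference, the 2KiB/4KiB figure falls out of the memslots layout: each
struct kvm_memslots embeds a 128-bucket id hash table
(DECLARE_HASHTABLE(id_hash, 7)), and there are two generations of memslots
per address space.  Sketching the arithmetic (the macro name below is made
up for illustration, it's not what the assertions actually use):

/*
 * 64-bit math: 128 buckets * 8 bytes = 1KiB per memslots instance, two
 * instances (active/inactive) per address space, i.e. 2KiB, or 4KiB when
 * SMM adds a second address space.
 */
#define KVM_MEMSLOTS_HASH_SIZE \
        (KVM_MAX_NR_ADDRESS_SPACES * 2 * (1 << 7) * sizeof(struct hlist_head))

static_assert(KVM_MEMSLOTS_HASH_SIZE == KVM_MAX_NR_ADDRESS_SPACES * 2 * SZ_1K);

The size assertions can then explicitly carve out this chunk when bounding
sizeof(struct kvm), which is roughly what the "accounting" above amounts to.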

Side topic #3, AFAIK, DEBUG_KERNEL=n builds are quite rare, so I'm banking
on build bots tripping the assert (I'll also add a DEBUG_KERNEL=n config to
my own testing, probably).

v3:
 - Add comments explaining the {READ,WRITE}_ONCE logic, and why it's safe
   to set the list outside of mmu_lock (a rough sketch of that pattern
   follows the changelog). [Vipin]
 - Make the assertions actually work. [Vipin]
 - Refine the assertions so they (hopefully) won't fail on kernels with
   a bunch of debug crud added.

v2:
 - https://lore.kernel.org/all/20250401155714.838398-1-seanjc@google.com
 - Actually defer allocation when using TDP MMU. [Vipin]
 - Free allocation on MMU teardown. [Vipin]

v1: https://lore.kernel.org/all/20250315024010.2360884-1-seanjc@google.com
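
As promised above, a rough sketch of the {READ,WRITE}_ONCE pattern (the
function names are illustrative, not the literal patch): the table is
zero-initialized before the pointer is published, so readers either observe
NULL, i.e. "no shadow pages yet", or a fully initialized table, and the
publishing store doesn't need to be done under mmu_lock.

static struct hlist_head *kvm_mmu_page_hash(struct kvm *kvm)
{
        /*
         * Pairs with the WRITE_ONCE() below; NULL means shadow paging
         * hasn't been used yet, so there are no shadow pages to find.
         */
        return READ_ONCE(kvm->arch.mmu_page_hash);
}

static void kvm_mmu_publish_page_hash(struct kvm *kvm, struct hlist_head *h)
{
        /*
         * The table was zero-allocated, i.e. is fully initialized; use
         * WRITE_ONCE() so the pointer store can't be torn or reordered
         * by the compiler, even though mmu_lock isn't held.
         */
        WRITE_ONCE(kvm->arch.mmu_page_hash, h);
}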

Sean Christopherson (3):
  KVM: x86/mmu: Dynamically allocate shadow MMU's hashed page list
  KVM: x86: Use kvzalloc() to allocate VM struct
  KVM: x86/mmu: Defer allocation of shadow MMU's hashed page list

 arch/x86/include/asm/kvm_host.h |  6 +--
 arch/x86/kvm/mmu/mmu.c          | 73 ++++++++++++++++++++++++++++++---
 arch/x86/kvm/svm/svm.c          |  2 +
 arch/x86/kvm/vmx/main.c         |  2 +
 arch/x86/kvm/vmx/vmx.c          |  2 +
 arch/x86/kvm/x86.c              |  5 ++-
 arch/x86/kvm/x86.h              | 22 ++++++++++
 7 files changed, 102 insertions(+), 10 deletions(-)


base-commit: 7ef51a41466bc846ad794d505e2e34ff97157f7f
-- 
2.49.0.1112.g889b7c5bd8-goog

