lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 14 May 2024 17:59:44 -0700
From: Rick Edgecombe <rick.p.edgecombe@...el.com>
To: pbonzini@...hat.com,
	seanjc@...gle.com,
	kvm@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
	isaku.yamahata@...il.com,
	erdemaktas@...gle.com,
	sagis@...gle.com,
	yan.y.zhao@...el.com,
	dmatlack@...gle.com,
	rick.p.edgecombe@...el.com
Subject: [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is called for TDX

When virtualizing some CPU features, KVM uses kvm_zap_gfn_range() to zap
guest mappings so they can be faulted in with different PTE properties.

For TDX private memory this technique is fundamentally not possible.
Remapping private memory requires the guest to "accept" it, and also the
needed PTE properties are not currently supported by TDX for private
memory.

These CPU features are:
1) MTRR update
2) CR0.CD update
3) Non-coherent DMA status update
4) APICV update

Since they cannot be supported, they should be blocked from being
exercised by a TD. In the case of CRO.CD, the feature is fundamentally not
supported for TDX as it cannot see the guest registers. For APICV
inhibit it in future changes.

Guest MTRR support is more of an interesting case. Supported versions of
the TDX module fix the MTRR CPUID bit to 1, but as previously discussed,
it is not possible to fully support the feature. This leaves KVM with a
few options:
 - Support a modified version of the architecture where the caching
   attributes are ignored for private memory.
 - Don't support MTRRs and treat the set MTRR CPUID bit as a TDX Module
   bug.

With the additional consideration that likely guest MTRR support in KVM
will be going away, the later option is the best. Prevent MTRR MSR writes
from calling kvm_zap_gfn_range() in future changes.

Lastly, the most interesting case is non-coherent DMA status updates.
There isn't a way to reject the call. KVM is just notified that there is a
non-coherent DMA device attached, and expected to act accordingly. For
normal VMs today, that means to start respecting guest PAT. However,
recently there has been a proposal to avoid doing this on selfsnoop CPUs
(see link). On such CPUs it should not be problematic to simply always
configure the EPT to honor guest PAT. In future changes TDX can enforce
this behavior for shared memory, resulting in shared memory always
respecting guest PAT for TDX. So kvm_zap_gfn_range() will not need to be
called in this case either.

Unfortunately, this will result in different cache attributes between
private and shared memory, as private memory is always WB and cannot be
changed by the VMM on current TDX modules. But it can't really be helped
while also supporting non-coherent DMA devices.

Since all callers will be prevented from calling kvm_zap_gfn_range() in
future changes, report a bug and terminate the guest if other future
changes to KVM result in triggering kvm_zap_gfn_range() for a TD.

For lack of a better method currently, use kvm_gfn_shared_mask() to
determine if private memory cannot be zapped (as in TDX, the only VM type
that sets it).

Link: https://lore.kernel.org/all/20240309010929.1403984-6-seanjc@google.com/
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@...el.com>
---
TDX MMU Part 1:
 - Remove support from "KVM: x86/tdp_mmu: Zap leafs only for private memory"
 - Add this KVM_BUG_ON() instead
---
 arch/x86/kvm/mmu/mmu.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index d5cf5b15a10e..808805b3478d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6528,8 +6528,17 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
 
 	flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
 
-	if (tdp_mmu_enabled)
+	if (tdp_mmu_enabled) {
+		/*
+		 * kvm_zap_gfn_range() is used when MTRR or PAT memory
+		 * type was changed.  TDX can't handle zapping the private
+		 * mapping, but it's ok because KVM doesn't support either of
+		 * those features for TDX. In case a new caller appears, BUG
+		 * the VM if it's called for solutions with private aliases.
+		 */
+		KVM_BUG_ON(kvm_gfn_shared_mask(kvm), kvm);
 		flush = kvm_tdp_mmu_zap_leafs(kvm, gfn_start, gfn_end, flush);
+	}
 
 	if (flush)
 		kvm_flush_remote_tlbs_range(kvm, gfn_start, gfn_end - gfn_start);
-- 
2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ