lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3fa97619ca852854408eec8bb2f7a0ed635b3c1d.camel@intel.com>
Date: Tue, 28 May 2024 18:29:59 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "pbonzini@...hat.com" <pbonzini@...hat.com>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai" <kai.huang@...el.com>,
	"sagis@...gle.com" <sagis@...gle.com>, "isaku.yamahata@...ux.intel.com"
	<isaku.yamahata@...ux.intel.com>, "Aktas, Erdem" <erdemaktas@...gle.com>,
	"Zhao, Yan Y" <yan.y.zhao@...el.com>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "dmatlack@...gle.com" <dmatlack@...gle.com>,
	"isaku.yamahata@...il.com" <isaku.yamahata@...il.com>
Subject: Re: [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for
 TDP MMU

On Tue, 2024-05-28 at 19:16 +0200, Paolo Bonzini wrote:
> > > After this, gfn_t's never have shared bit. It's a simple rule. The MMU
> > > mostly
> > > thinks it's operating on a shared root that is mapped at the normal GFN.
> > > Only
> > > the iterator knows that the shared PTEs are actually in a different
> > > location.
> > > 
> > > There are some negative side effects:
> > > 1. The struct kvm_mmu_page's gfn doesn't match it's actual mapping
> > > anymore.
> > > 2. As a result of above, the code that flushes TLBs for a specific GFN
> > > will be
> > > confused. It won't functionally matter for TDX, just look buggy to see
> > > flushing
> > > code called with the wrong gfn.
> > 
> > flush_remote_tlbs_range() is only for Hyper-V optimization.  In other cases,
> > x86_op.flush_remote_tlbs_range = NULL or the member isn't defined at compile
> > time.  So the remote tlb flush falls back to flushing whole range.  I don't
> > expect TDX in hyper-V guest.  I have to admit that the code looks
> > superficially
> > broken and confusing.
> 
> You could add an "&& kvm_has_private_root(kvm)" to
> kvm_available_flush_remote_tlbs_range(), since
> kvm_has_private_root(kvm) is sort of equivalent to "there is no 1:1
> correspondence between gfn and PTE to be flushed".
> 
> I am conflicted myself, but the upsides below are pretty substantial.

It looks like kvm_available_flush_remote_tlbs_range() is not checked in many of
the paths that get to x86_ops.flush_remote_tlbs_range().

So maybe something like:
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 65bbda95acbb..e09bb6c50a0b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1959,14 +1959,7 @@ static inline int kvm_arch_flush_remote_tlbs(struct kvm
*kvm)
 
 #if IS_ENABLED(CONFIG_HYPERV)
 #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS_RANGE
-static inline int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn,
-                                                  u64 nr_pages)
-{
-       if (!kvm_x86_ops.flush_remote_tlbs_range)
-               return -EOPNOTSUPP;
-
-       return static_call(kvm_x86_flush_remote_tlbs_range)(kvm, gfn, nr_pages);
-}
+int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, u64 nr_pages);
 #endif /* CONFIG_HYPERV */
 
 enum kvm_intr_type {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43d70f4c433d..9dc1b3db286d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -14048,6 +14048,14 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu,
unsigned int size,
 }
 EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
 
+int kvm_arch_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, u64 nr_pages)
+{
+       if (!kvm_x86_ops.flush_remote_tlbs_range || kvm_gfn_direct_mask(kvm))
+               return -EOPNOTSUPP;
+
+       return static_call(kvm_x86_flush_remote_tlbs_range)(kvm, gfn, nr_pages);
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_mmio);


Regarding the kvm_gfn_direct_mask() usage, in the current WIP code we have
renamed things around the concepts of "mirrored roots" and "direct masks". The
mirrored root, just means "also go off an update something else" (S-EPT). The
direct mask, just means when on the direct root, shift the actual page table
mapping using the mask (shared memory). Kai raised that all TDX special stuff in
the x86 MMU around handling private memory is confusing from the SEV
perspective, so we were trying to rename those things to something related, but
generic instead of "private".

So the TLB flush confusion is more about that the direct GFNs are shifted by
something (i.e. kvm_gfn_direct_mask() returns non-zero).

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ