lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 15 May 2024 23:14:38 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>, "Huang, Kai"
	<kai.huang@...el.com>, "Yamahata, Isaku" <isaku.yamahata@...el.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"sagis@...gle.com" <sagis@...gle.com>, "isaku.yamahata@...ux.intel.com"
	<isaku.yamahata@...ux.intel.com>, "isaku.yamahata@...il.com"
	<isaku.yamahata@...il.com>, "Aktas, Erdem" <erdemaktas@...gle.com>, "Zhao,
 Yan Y" <yan.y.zhao@...el.com>, "dmatlack@...gle.com" <dmatlack@...gle.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>
Subject: Re: [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is
 called for TDX

On Thu, 2024-05-16 at 10:17 +1200, Huang, Kai wrote:
> > TDX has several aspects related to the TDP MMU.
> > 1) Based on the faulting GPA, determine which KVM page table to walk.
> >      (private-vs-shared)
> > 2) Need to call TDX SEAMCALL to operate on Secure-EPT instead of direct
> > memory
> >      load/store.  TDP MMU needs hooks for it.
> > 3) The tables must be zapped from the leaf. not the root or the middle.
> > 
> > For 1) and 2), what about something like this?  TDX backend code will set
> > kvm->arch.has_mirrored_pt = true; I think we will use kvm_gfn_shared_mask()
> > only
> > for address conversion (shared<->private).

1 and 2 are not the same as "mirrored" though. You could have a design that
mirrors half of the EPT and doesn't track it with separate roots. In fact, 1
might be just a KVM design choice, even for TDX.

What we are really trying to do here is not put "is tdx" logic in the generic
code. We could rely on the fact that TDX is the only one with mirrored TDP, but
that is kind of what we are already doing with kvm_gfn_shared_mask().

How about we do helpers for each of your bullets, and they all just check:
vm_type == KVM_X86_TDX_VM

So like:
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a578ea09dfb3..c0beed5b090a 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -355,4 +355,19 @@ static inline bool kvm_is_private_gpa(const struct kvm
*kvm, gpa_t gpa)
        return mask && !(gpa_to_gfn(gpa) & mask);
 }
 
+static inline bool kvm_has_mirrored_tdp(struct kvm *kvm)
+{
+       return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
+static inline bool kvm_has_private_root(struct kvm *kvm)
+{
+       return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
+static inline bool kvm_zap_leafs_only(struct kvm *kvm)
+{
+       return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
 #endif


This is similar to what Sean proposed earlier, we just didn't get that far:
https://lore.kernel.org/kvm/ZhSYEVCHqSOpVKMh@google.com/


> > 
> > For 1), maybe we can add struct kvm_page_fault.walk_mirrored_pt
> >           (or whatever preferable name)?
> > 
> > For 3), flag of memslot handles it.
> > 
> > ---
> >    arch/x86/include/asm/kvm_host.h | 3 +++
> >    1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h
> > b/arch/x86/include/asm/kvm_host.h
> > index aabf1648a56a..218b575d24bd 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1289,6 +1289,7 @@ struct kvm_arch {
> >         u8 vm_type;
> >         bool has_private_mem;
> >         bool has_protected_state;
> > +       bool has_mirrored_pt;
> >         struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
> >         struct list_head active_mmu_pages;
> >         struct list_head zapped_obsolete_pages;
> > @@ -2171,8 +2172,10 @@ void kvm_configure_mmu(bool enable_tdp, int
> > tdp_forced_root_level,
> >    
> >    #ifdef CONFIG_KVM_PRIVATE_MEM
> >    #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
> > +#define kvm_arch_has_mirrored_pt(kvm) ((kvm)->arch.has_mirrored_pt)
> >    #else
> >    #define kvm_arch_has_private_mem(kvm) false
> > +#define kvm_arch_has_mirrored_pt(kvm) false
> >    #endif
> >    
> >    static inline u16 kvm_read_ldt(void)
> 
> I think this 'has_mirrored_pt' (or a better name) is better, because it 
> clearly conveys it is for the "page table", but not the actual page that 
> any page table entry maps to.
> 
> AFAICT we need to split the concept of "private page table itself" and 
> the "memory type of the actual GFN".
> 
> E.g., both SEV-SNP and TDX has concept of "private memory" (obviously), 
> but I was told only TDX uses a dedicated private page table which isn't 
> directly accessible for KVV.  SEV-SNP on the other hand just uses normal 
> page table + additional HW managed table to make sure the security.
> 
> In other words, I think we should decide whether to invoke TDP MMU 
> callback for private mapping (the page table itself may just be normal 
> one) depending on the fault->is_private, but not whether the page table 
> is private:
> 
>         if (fault->is_private && kvm_x86_ops->set_private_spte)
>                 kvm_x86_set_private_spte(...);
>         else
>                 tdp_mmu_set_spte_atomic(...);
> 
> And the 'has_mirrored_pt' should be only used to select the root of the 
> page table that we want to operate on.
> 
> This also gives a chance that if there's anything special needs to be 
> done for page allocated for the "non-leaf" middle page table for 
> SEV-SNP, it can just fit.
> 
> Of course, if 'has_mirrored_pt' is true, we can assume there's a special 
> way to operate it, i.e., kvm_x86_ops->set_private_spte etc must be valid.

It's good point that we are mixing up "private" in the code from SNP's
perspective.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ