Date:   Thu, 13 Apr 2023 13:58:32 -0700
From:   Sean Christopherson <seanjc@...gle.com>
To:     David Matlack <dmatlack@...gle.com>
Cc:     Jeremi Piotrowski <jpiotrowski@...ux.microsoft.com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Tianyu Lan <ltykernel@...il.com>,
        Michael Kelley <mikelley@...rosoft.com>
Subject: Re: [PATCH] KVM: SVM: Disable TDP MMU when running on Hyper-V

On Thu, Apr 13, 2023, David Matlack wrote:
> On Thu, Apr 13, 2023 at 12:10 PM Sean Christopherson <seanjc@...gle.com> wrote:
> >
> > On Thu, Apr 13, 2023, Sean Christopherson wrote:
> > > Aha!  Idea.  There are _at most_ 4 possible roots the TDP MMU can encounter.
> > > 4-level non-SMM, 4-level SMM, 5-level non-SMM, and 5-level SMM.  I.e. not keeping
> > > inactive roots on a per-VM basis is just monumentally stupid.
> >
> > One correction: there are 6 possible roots:
> >
> >   1. 4-level !SMM !guest_mode (i.e. not nested)
> >   2. 4-level SMM !guest_mode
> >   3. 5-level !SMM !guest_mode
> >   4. 5-level SMM !guest_mode
> >   5. 4-level !SMM guest_mode
> >   6. 5-level !SMM guest_mode
> >
> > I forgot that KVM still uses the TDP MMU when running L2 if L1 doesn't enable
> > EPT/TDP, i.e. if L1 is using shadow paging for L2.  But that really doesn't change
> > anything as each vCPU can already track 4 roots, i.e. userspace can saturate all
> > 6 roots anyways.  And in practice, no sane VMM will create a VM with both 4-level
> > and 5-level roots (KVM keys off of guest.MAXPHYADDR for the TDP root level).
> 
> Why do we create a new root for guest_mode=1 if L1 disables EPT/NPT?

Because "private", a.k.a. KVM-internal, memslots are visible to L1 but not L2.
Which for TDP means the APIC-access page.  From commit 3a2936dedd20:

    kvm: mmu: Don't expose private memslots to L2
    
    These private pages have special purposes in the virtualization of L1,
    but not in the virtualization of L2. In particular, L1's APIC access
    page should never be entered into L2's page tables, because this
    causes a great deal of confusion when the APIC virtualization hardware
    is being used to accelerate L2's accesses to its own APIC.

FWIW, I _think_ KVM could actually let L2 access the APIC-access page when L1 is
running without any APIC virtualization, i.e. when L1 is passing its APIC through
to L2.  E.g. something like the below, but I ain't touching that with a 10-foot pole
unless someone explicitly asks for it :-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 039fb16560a0..8aa12f5f2c30 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4370,10 +4370,13 @@ static int __kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
        if (!kvm_is_visible_memslot(slot)) {
                /* Don't expose private memslots to L2. */
                if (is_guest_mode(vcpu)) {
-                       fault->slot = NULL;
-                       fault->pfn = KVM_PFN_NOSLOT;
-                       fault->map_writable = false;
-                       return RET_PF_CONTINUE;
+                       if (!slot || slot->id != APIC_ACCESS_PAGE_PRIVATE_MEMSLOT ||
+                           nested_cpu_has_virtual_apic(vcpu)) {
+                               fault->slot = NULL;
+                               fault->pfn = KVM_PFN_NOSLOT;
+                               fault->map_writable = false;
+                               return RET_PF_CONTINUE;
+                       }
                }
                /*
                 * If the APIC access page exists but is disabled, go directly
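
To make the condition in the patch above easier to read, here is a minimal standalone sketch of the same decision as a pure predicate. Everything in it is a hypothetical stand-in, not kernel code: `struct demo_slot`, `DEMO_APIC_ACCESS_SLOT_ID`, and `demo_hide_private_slot()` are illustrative names, and the `l1_virtualizes_apic` flag stands in for whatever check `nested_cpu_has_virtual_apic(vcpu)` would perform in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memslot; not the kernel's real type. */
struct demo_slot {
	int id;
};

/* Arbitrary illustrative value, not the kernel's real slot id. */
#define DEMO_APIC_ACCESS_SLOT_ID 1

/*
 * Returns true if a fault on a private (KVM-internal) memslot should be
 * handled as "no slot" because the vCPU is running L2.  Returns false if
 * the L2-specific rule does not apply; a real caller would then continue
 * with its normal handling of the private slot.
 */
static bool demo_hide_private_slot(const struct demo_slot *slot,
				   bool guest_mode, bool l1_virtualizes_apic)
{
	/* The rule only restricts L2; L1 can see private memslots. */
	if (!guest_mode)
		return false;

	/* Only the APIC-access page is a candidate for exposure to L2. */
	if (!slot || slot->id != DEMO_APIC_ACCESS_SLOT_ID)
		return true;

	/* Keep hiding the page when L1 virtualizes the APIC for L2. */
	return l1_virtualizes_apic;
}
```

Under these assumptions, the only case where L2 would see the page is `guest_mode == true`, the slot is the APIC-access slot, and L1 is not virtualizing the APIC, which is the pass-through case described before the patch.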

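As a cross-check of the root count in the quoted correction, the sketch below enumerates every combination of root level, SMM, and guest_mode, and skips the SMM + guest_mode combination because the quoted list has no such entry. The `struct demo_root_role` type and `demo_enumerate_roots()` are hypothetical names for illustration only and do not mirror KVM's real role structure.

```c
#include <stdbool.h>

/* Hypothetical model of what distinguishes one TDP MMU root from another. */
struct demo_root_role {
	int level;		/* 4 or 5 */
	bool smm;
	bool guest_mode;	/* true when the vCPU is running L2 */
};

/*
 * Fills out[] with the roles from the quoted list, in the same order,
 * and returns how many were written.  out[] needs room for 8 entries.
 */
static int demo_enumerate_roots(struct demo_root_role out[8])
{
	int n = 0;

	for (int gm = 0; gm <= 1; gm++) {
		for (int level = 4; level <= 5; level++) {
			for (int smm = 0; smm <= 1; smm++) {
				/* The quoted list has no SMM + guest_mode root. */
				if (smm && gm)
					continue;
				out[n].level = level;
				out[n].smm = smm;
				out[n].guest_mode = gm;
				n++;
			}
		}
	}
	return n;
}
```

With both levels allowed this yields the 6 roots from the list. If a VM only ever uses one root level, as the quoted text expects in practice, the same enumeration restricted to that level gives 3.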