[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YqnsAIKxQniAsb1d@google.com>
Date: Wed, 15 Jun 2022 14:26:08 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org,
Lai Jiangshan <jiangshan.ljs@...group.com>
Subject: Re: [PATCH v2 5/8] KVM: x86/mmu: Use separate namespaces for guest
PTEs and shadow PTEs
On Wed, Jun 15, 2022, Paolo Bonzini wrote:
> On 6/15/22 01:33, Sean Christopherson wrote:
> > Separate the macros for KVM's shadow PTEs (SPTE) from guest 64-bit PTEs
> > (PT64). SPTE and PT64 are _mostly_ the same, but the few differences are
> > quite critical, e.g. *_BASE_ADDR_MASK must differentiate between host and
> > guest physical address spaces, and SPTE_PERM_MASK (was PT64_PERM_MASK) is
> > very much specific to SPTEs.
> >
> > Opportunistically (and temporarily) move most guest macros into paging.h
> > to clearly associate them with shadow paging, and to ensure that they're
> > not used as of this commit. A future patch will eliminate them entirely.
> >
> > Sadly, PT32_LEVEL_BITS is left behind in mmu_internal.h because it's
> > needed for the quadrant calculation in kvm_mmu_get_page(). The quadrant
> > calculation is hot enough (when using shadow paging with 32-bit guests)
> > that adding a per-context helper is undesirable, and burying the
> > computation in paging_tmpl.h with a forward declaration isn't exactly an
> > improvement.
> >
> > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
>
> A better try:
Very nice! It's obvious once someone else writes the code. :-)
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 54b3e39d07b3..cd561b49cc84 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2011,8 +2011,21 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
> role.direct = direct;
> role.access = access;
> if (role.has_4_byte_gpte) {
> + /*
> + * The "quadrant" value corresponds to those bits of the address
> + * that have already been used by the 8-byte shadow page table
> + * lookup, but not yet in the 4-byte guest page tables. Having
> + * the quadrant as part of the role ensures that each upper sPTE
> + * points to the the correct portion of the guest page table
> + * structure.
> + *
> + * For example, a 4-byte PDE consumes bits 31:22 and an 8-byte PDE
> + * consumes bits 29:21. Each guest PD must be expanded into four
> + * shadow PDs, one for each value of bits 31:30, and the PDPEs
> + * will use the four quadrants in round-robin fashion.
It's not round-robin, that would imply KVM rotates through each quadrant on its
own. FWIW, I like David's comment from his patch that simplifies this mess in a
similar way.
https://lore.kernel.org/all/20220516232138.1783324-5-dmatlack@google.com
> + */
> quadrant = gaddr >> (PAGE_SHIFT + (SPTE_LEVEL_BITS * level));
> - quadrant &= (1 << ((PT32_LEVEL_BITS - SPTE_LEVEL_BITS) * level)) - 1;
> + quadrant &= (1 << level) - 1;
> role.quadrant = quadrant;
> }
> if (level <= vcpu->arch.mmu->cpu_role.base.level)
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index cb9d4d358335..5e1e3c8f8aaa 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -20,9 +20,6 @@ extern bool dbg;
> #define MMU_WARN_ON(x) do { } while (0)
> #endif
> -/* The number of bits for 32-bit PTEs is to needed compute the quandrant. */
Heh, and it gets rid of my typo.
> -#define PT32_LEVEL_BITS 10
> -
> /* Page table builder macros common to shadow (host) PTEs and guest PTEs. */
> #define __PT_LEVEL_SHIFT(level, bits_per_level) \
> (PAGE_SHIFT + ((level) - 1) * (bits_per_level))
> diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
> index 6c29aef4092b..e4655056e651 100644
> --- a/arch/x86/kvm/mmu/paging_tmpl.h
> +++ b/arch/x86/kvm/mmu/paging_tmpl.h
> @@ -38,7 +38,7 @@
> #define pt_element_t u32
> #define guest_walker guest_walker32
> #define FNAME(name) paging##32_##name
> - #define PT_LEVEL_BITS PT32_LEVEL_BITS
> + #define PT_LEVEL_BITS 10
> #define PT_MAX_FULL_LEVELS 2
> #define PT_GUEST_DIRTY_SHIFT PT_DIRTY_SHIFT
> #define PT_GUEST_ACCESSED_SHIFT PT_ACCESSED_SHIFT
>
> Paolo
>
Powered by blists - more mailing lists