linux-kernel - Re: [PATCH] KVM: x86/mmu: Do not create SPTEs for GFNs that exceed host.MAXPHYADDR

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Ymv1I5ixX1+k8Nst@google.com>
Date:   Fri, 29 Apr 2022 14:24:35 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Maxim Levitsky <mlevitsk@...hat.com>,
        Ben Gardon <bgardon@...gle.com>,
        David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH] KVM: x86/mmu: Do not create SPTEs for GFNs that exceed
 host.MAXPHYADDR

On Fri, Apr 29, 2022, Paolo Bonzini wrote:
> On 4/29/22 01:34, Sean Christopherson wrote:
> 
> > +static inline gfn_t kvm_mmu_max_gfn_host(void)
> > +{
> > +	/*
> > +	 * Disallow SPTEs (via memslots or cached MMIO) whose gfn would exceed
> > +	 * host.MAXPHYADDR.  Assuming KVM is running on bare metal, guest
> > +	 * accesses beyond host.MAXPHYADDR will hit a #PF(RSVD) and never hit
> > +	 * an EPT Violation/Misconfig / #NPF, and so KVM will never install a
> > +	 * SPTE for such addresses.  That doesn't hold true if KVM is running
> > +	 * as a VM itself, e.g. if the MAXPHYADDR KVM sees is less than
> > +	 * hardware's real MAXPHYADDR, but since KVM can't honor such behavior
> > +	 * on bare metal, disallow it entirely to simplify e.g. the TDP MMU.
> > +	 */
> > +	return (1ULL << (shadow_phys_bits - PAGE_SHIFT)) - 1;
> 
> The host.MAXPHYADDR however does not matter if EPT/NPT is not in use, because
> the shadow paging fault path can accept any gfn.

... 

> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index e6cae6f22683..dba275d323a7 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -65,6 +65,30 @@ static __always_inline u64 rsvd_bits(int s, int e)
>  	return ((2ULL << (e - s)) - 1) << s;
>  }
> +/*
> + * The number of non-reserved physical address bits irrespective of features
> + * that repurpose legal bits, e.g. MKTME.
> + */
> +extern u8 __read_mostly shadow_phys_bits;
> +
> +static inline gfn_t kvm_mmu_max_gfn(void)
> +{
> +	/*
> +	 * Note that this uses the host MAXPHYADDR, not the guest's.
> +	 * EPT/NPT cannot support GPAs that would exceed host.MAXPHYADDR;
> +	 * assuming KVM is running on bare metal, guest accesses beyond
> +	 * host.MAXPHYADDR will hit a #PF(RSVD) and never cause a vmexit
> +	 * (either EPT Violation/Misconfig or #NPF), and so KVM will never
> +	 * install a SPTE for such addresses.  If KVM is running as a VM
> +	 * itself, on the other hand, it might see a MAXPHYADDR that is less
> +	 * than hardware's real MAXPHYADDR.  Using the host MAXPHYADDR
> +	 * disallows such SPTEs entirely and simplifies the TDP MMU.
> +	 */
> +	int max_gpa_bits = likely(tdp_enabled) ? shadow_phys_bits : 52;

I don't love the divergent memslot behavior, but it's technically correct, so I
can't really argue.  Do we want to "officially" document the memslot behavior?

> +
> +	return (1ULL << (max_gpa_bits - PAGE_SHIFT)) - 1;
> +}
> +
>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
>  void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);