Message-ID: <YlR0a4PG5xzweeMZ@google.com>
Date: Mon, 11 Apr 2022 18:33:15 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Mingwei Zhang <mizhang@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/6] KVM: x86/mmu: Properly account NX huge page
workaround for nonpaging MMUs
On Mon, Apr 11, 2022, Mingwei Zhang wrote:
> On Sat, Apr 09, 2022, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > index 671cfeccf04e..89df062d5921 100644
> > --- a/arch/x86/kvm/mmu.h
> > +++ b/arch/x86/kvm/mmu.h
> > @@ -191,6 +191,15 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> > .user = err & PFERR_USER_MASK,
> > .prefetch = prefetch,
> > .is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
> > +
> > + /*
> > + * Note, enforcing the NX huge page mitigation for nonpaging
> > + * MMUs (shadow paging, CR0.PG=0 in the guest) is completely
> > + * unnecessary. The guest doesn't have any page tables to
> > + * abuse and is guaranteed to switch to a different MMU when
> > + * CR0.PG is toggled on (may not always be guaranteed when KVM
> > + * is using TDP). See make_spte() for details.
> > + */
> > .nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(),
>
> hmm. I think there could be a minor issue here (even in original code).
> The nx_huge_page_workaround_enabled is attached here with page fault.
> However, at the time of make_spte(), we call is_nx_huge_page_enabled()
> again. Since this function will directly check the module parameter,
> there might be a race condition here. eg., at the time of page fault,
> the workround was 'true', while by the time we reach make_spte(), the
> parameter was set to 'false'.
Toggling the mitigation invalidates and zaps all roots. Any page fault that acquires
mmu_lock after the toggling is guaranteed to see the correct value, and any page fault
that completed before kvm_mmu_zap_all_fast() is guaranteed to have its SPTEs zapped.
> I have not figured out what the side effect is. But I feel like the
> make_spte() should just follow the information in kvm_page_fault instead
> of directly querying the global config.
I started down this exact path :-) The problem is that, even without Ben's series,
KVM uses make_spte() for things other than page faults.