Message-ID: <20240815083039.737qa2bxhwxxhf32@yy-desk-7060>
Date: Thu, 15 Aug 2024 16:30:39 +0800
From: Yuan Yao <yuan.yao@...ux.intel.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Peter Gonda <pgonda@...gle.com>,
Michael Roth <michael.roth@....com>,
Vishal Annapurve <vannapurve@...gle.com>,
Ackerly Tng <ackerleytng@...gle.com>
Subject: Re: [PATCH 03/22] KVM: x86/mmu: Trigger unprotect logic only on
write-protection page faults
On Wed, Aug 14, 2024 at 07:21:06AM -0700, Sean Christopherson wrote:
> On Wed, Aug 14, 2024, Yuan Yao wrote:
> > > @@ -5960,6 +5961,41 @@ void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
> > > write_unlock(&vcpu->kvm->mmu_lock);
> > > }
> > >
> > > +static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> > > + u64 error_code, int *emulation_type)
> > > +{
> > > + bool direct = vcpu->arch.mmu->root_role.direct;
> > > +
> > > + /*
> > > + * Before emulating the instruction, check if the error code
> > > + * was due to a RO violation while translating the guest page.
> > > + * This can occur when using nested virtualization with nested
> > > + * paging in both guests. If true, we simply unprotect the page
> > > + * and resume the guest.
> > > + */
> > > + if (direct &&
> > > + (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) {
> > > + kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa));
> > > + return RET_PF_FIXED;
> > > + }
> > > +
> > > + /*
> > > + * The gfn is write-protected, but if emulation fails we can still
> > > + * optimistically try to just unprotect the page and let the processor
> > > + * re-execute the instruction that caused the page fault. Do not allow
> > > + * retrying MMIO emulation, as it's not only pointless but could also
> > > + * cause us to enter an infinite loop because the processor will keep
> > > + * faulting on the non-existent MMIO address. Retrying an instruction
> > > + * from a nested guest is also pointless and dangerous as we are only
> > > + * explicitly shadowing L1's page tables, i.e. unprotecting something
> > > + * for L1 isn't going to magically fix whatever issue cause L2 to fail.
> > > + */
> > > + if (!mmio_info_in_cache(vcpu, cr2_or_gpa, direct) && !is_guest_mode(vcpu))
> >
> > Looks like the mmio_info_in_cache() check can be removed; emulation
> > should not get here now that RET_PF_WRITE_PROTECTED is introduced.
> > Maybe WARN_ON_ONCE() it instead.
>
> Yeah, that was my instinct as well. I kept it mostly because I liked having the
> comment, but also because I was thinking the cache could theoretically get a hit.
> But that's not true. KVM should return RET_PF_WRITE_PROTECTED if and only if
> there is a memslot, and creating a memslot is supposed to invalidate the MMIO
> cache by virtue of changing the memslot generation.
>
> Unless someone feels strongly that the mmio_info_in_cache() call should be
> deleted entirely, I'll tack on this patch. The cache lookup is cheap, and IMO
> it's helpful to explicitly document that it should be impossible to reach this
> point with what appears to be MMIO.
>
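For anyone following along, the invalidation Sean describes works roughly like
the sketch below. This is a simplified, standalone illustration (hypothetical
names and types, not the actual KVM code): the MMIO cache entry records the
memslot generation at fill time, so a memslot create/delete, which bumps the
generation, turns any stale entry into a miss.

	/*
	 * Hypothetical sketch, not the real KVM implementation: a cached
	 * MMIO translation is only considered valid if the memslot
	 * generation it was filled under still matches the current one.
	 */
	#include <stdbool.h>
	#include <stdint.h>

	struct mmio_gpa_cache {
		uint64_t generation;	/* memslot generation at fill time */
		uint64_t gpa;		/* GPA that previously hit MMIO */
		bool valid;
	};

	static bool mmio_cache_hit(const struct mmio_gpa_cache *cache,
				   uint64_t cur_generation, uint64_t gpa)
	{
		/*
		 * Creating a memslot bumps the generation, so an entry cached
		 * before the slot existed can never match afterwards -- which
		 * is why a write-protection fault on a slot-backed gfn should
		 * never see an MMIO cache hit.
		 */
		if (!cache->valid || cache->generation != cur_generation)
			return false;

		return cache->gpa == gpa;
	}

Given that invariant, the WARN_ON_ONCE() in the patch below documents the
"should be impossible" case rather than handling an expected path.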
> ---
> arch/x86/kvm/mmu/mmu.c | 28 +++++++++++++++++++---------
> 1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 50695eb2ee22..7f3f57237f23 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5997,6 +5997,18 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> vcpu->arch.last_retry_eip = 0;
> vcpu->arch.last_retry_addr = 0;
>
> + /*
> + * It should be impossible to reach this point with an MMIO cache hit,
> + * as RET_PF_WRITE_PROTECTED is returned if and only if there's a valid,
> + * writable memslot, and creating a memslot should invalidate the MMIO
> + * cache by way of changing the memslot generation. WARN and disallow
> + * retry if MMIO is detected, as retrying MMIO emulation is pointless and
> + * could put the vCPU into an infinite loop because the processor will
> + * keep faulting on the non-existent MMIO address.
> + */
> + if (WARN_ON_ONCE(mmio_info_in_cache(vcpu, cr2_or_gpa, direct)))
> + return RET_PF_EMULATE;
> +
LGTM.
Reviewed-by: Yuan Yao <yuan.yao@...el.com>
> /*
> * Before emulating the instruction, check to see if the access may be
> * due to L1 accessing nested NPT/EPT entries used for L2, i.e. if the
> @@ -6029,17 +6041,15 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> return RET_PF_FIXED;
>
> /*
> - * The gfn is write-protected, but if emulation fails we can still
> - * optimistically try to just unprotect the page and let the processor
> + * The gfn is write-protected, but if KVM detects it's emulating an
> + * instruction that is unlikely to be used to modify page tables, or if
> + * emulation fails, KVM can try to unprotect the gfn and let the CPU
> * re-execute the instruction that caused the page fault. Do not allow
> - * retrying MMIO emulation, as it's not only pointless but could also
> - * cause us to enter an infinite loop because the processor will keep
> - * faulting on the non-existent MMIO address. Retrying an instruction
> - * from a nested guest is also pointless and dangerous as we are only
> - * explicitly shadowing L1's page tables, i.e. unprotecting something
> - * for L1 isn't going to magically fix whatever issue cause L2 to fail.
> + * retrying an instruction from a nested guest as KVM is only explicitly
> + * shadowing L1's page tables, i.e. unprotecting something for L1 isn't
> + * going to magically fix whatever issue caused L2 to fail.
> */
> - if (!mmio_info_in_cache(vcpu, cr2_or_gpa, direct) && !is_guest_mode(vcpu))
> + if (!is_guest_mode(vcpu))
> *emulation_type |= EMULTYPE_ALLOW_RETRY_PF;
>
> return RET_PF_EMULATE;
>
> base-commit: 7d33880356496eb0640c6c825cc60898063c4902
> --