Message-ID: <2bec792d-22aa-4c79-8324-2f801407a4eb@redhat.com>
Date: Wed, 14 Aug 2024 18:47:51 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Peter Gonda <pgonda@...gle.com>, Michael Roth <michael.roth@....com>,
Vishal Annapurve <vannapurve@...gle.com>,
Ackerly Tng <ackerleytng@...gle.com>
Subject: Re: [PATCH 04/22] KVM: x86/mmu: Skip emulation on page fault iff 1+
SPs were unprotected
On 8/9/24 21:03, Sean Christopherson wrote:
> When doing "fast unprotection" of nested TDP page tables, skip emulation
> if and only if at least one gfn was unprotected, i.e. continue with
> emulation if simply resuming is likely to hit the same fault and risk
> putting the vCPU into an infinite loop.
>
> Note, it's entirely possible to get a false negative, e.g. if a different
> vCPU faults on the same gfn and unprotects the gfn first, but that's a
> relatively rare edge case, and emulating is still functionally ok, i.e.
> the risk of putting the vCPU isn't an infinite loop isn't justified.

English snafu - "the risk of causing a livelock for the vCPU is
negligible", perhaps?

Paolo

> Fixes: 147277540bbc ("kvm: svm: Add support for additional SVM NPF error codes")
> Cc: stable@...r.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 28 ++++++++++++++++++++--------
> 1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e3aa04c498ea..95058ac4b78c 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5967,17 +5967,29 @@ static int kvm_mmu_write_protect_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
>  	bool direct = vcpu->arch.mmu->root_role.direct;
>
>  	/*
> -	 * Before emulating the instruction, check if the error code
> -	 * was due to a RO violation while translating the guest page.
> -	 * This can occur when using nested virtualization with nested
> -	 * paging in both guests. If true, we simply unprotect the page
> -	 * and resume the guest.
> +	 * Before emulating the instruction, check to see if the access may be
> +	 * due to L1 accessing nested NPT/EPT entries used for L2, i.e. if the
> +	 * gfn being written is for gPTEs that KVM is shadowing and has write-
> +	 * protected. Because AMD CPUs walk nested page table using a write
> +	 * operation, walking NPT entries in L1 can trigger write faults even
> +	 * when L1 isn't modifying PTEs, and thus result in KVM emulating an
> +	 * excessive number of L1 instructions without triggering KVM's write-
> +	 * flooding detection, i.e. without unprotecting the gfn.
> +	 *
> +	 * If the error code was due to a RO violation while translating the
> +	 * guest page, the current MMU is direct (L1 is active), and KVM has
> +	 * shadow pages, then the above scenario is likely being hit. Try to
> +	 * unprotect the gfn, i.e. zap any shadow pages, so that L1 can walk
> +	 * its NPT entries without triggering emulation. If one or more shadow
> +	 * pages was zapped, skip emulation and resume L1 to let it natively
> +	 * execute the instruction. If no shadow pages were zapped, then the
> +	 * write-fault is due to something else entirely, i.e. KVM needs to
> +	 * emulate, as resuming the guest will put it into an infinite loop.
>  	 */
>  	if (direct &&
> -	    (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE) {
> -		kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa));
> +	    (error_code & PFERR_NESTED_GUEST_PAGE) == PFERR_NESTED_GUEST_PAGE &&
> +	    kvm_mmu_unprotect_page(vcpu->kvm, gpa_to_gfn(cr2_or_gpa)))
>  		return RET_PF_FIXED;
> -	}
>
>  	/*
>  	 * The gfn is write-protected, but if emulation fails we can still
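
For readers skimming the hunk above, the behavioral change can be modeled
outside the kernel roughly as follows. This is only a userspace sketch under
stated assumptions: the enum, the mask value, the boolean array standing in
for shadow-page state, and both function names are hypothetical stand-ins,
not KVM's definitions. It shows one thing, namely that the fault is reported
as fixed if and only if the unprotect step actually zapped something.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum ret_pf { RET_PF_EMULATE, RET_PF_FIXED };

/* Placeholder bits; the real PFERR_NESTED_GUEST_PAGE value differs. */
#define NESTED_GUEST_PAGE_MASK 0x3u

/*
 * Toy model of unprotecting a gfn: clears the write-protection flag and
 * returns true iff the gfn was write-protected, i.e. something was zapped.
 */
static bool unprotect_gfn(bool *write_protected, uint64_t gfn)
{
	bool zapped = write_protected[gfn];

	write_protected[gfn] = false;
	return zapped;
}

/*
 * Skip emulation (RET_PF_FIXED) only when the MMU is direct, the error
 * code has all the nested-guest-page bits set, and unprotecting zapped at
 * least one page. Otherwise fall through to emulation, since resuming the
 * guest would likely hit the same fault again.
 */
static enum ret_pf write_protect_fault(bool *write_protected, uint64_t gfn,
				       bool direct, unsigned int error_code)
{
	if (direct &&
	    (error_code & NESTED_GUEST_PAGE_MASK) == NESTED_GUEST_PAGE_MASK &&
	    unprotect_gfn(write_protected, gfn))
		return RET_PF_FIXED;

	return RET_PF_EMULATE;
}
```

In this model, a second fault on the same gfn falls through to emulation
because nothing is left to zap, which corresponds to the "false negative"
case the changelog mentions (another vCPU unprotecting the gfn first).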