Message-ID: <20240819173453.GB2210585.vipinsh@google.com>
Date: Mon, 19 Aug 2024 10:34:53 -0700
From: Vipin Sharma <vipinsh@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: pbonzini@...hat.com, dmatlack@...gle.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] KVM: x86/mmu: Recover NX Huge pages belonging to TDP
MMU under MMU read lock
On 2024-08-16 16:38:05, Sean Christopherson wrote:
> On Mon, Aug 12, 2024, Vipin Sharma wrote:
> > @@ -1831,12 +1845,17 @@ ulong kvm_tdp_mmu_recover_nx_huge_pages(struct kvm *kvm, ulong to_zap)
> > * recovered, along with all the other huge pages in the slot,
> > * when dirty logging is disabled.
> > */
> > - if (kvm_mmu_sp_dirty_logging_enabled(kvm, sp))
> > + if (kvm_mmu_sp_dirty_logging_enabled(kvm, sp)) {
> > + spin_lock(&kvm->arch.tdp_mmu_pages_lock);
> > unaccount_nx_huge_page(kvm, sp);
> > - else
> > - flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
> > -
> > - WARN_ON_ONCE(sp->nx_huge_page_disallowed);
> > + spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> > + to_zap--;
> > + WARN_ON_ONCE(sp->nx_huge_page_disallowed);
> > + } else if (tdp_mmu_zap_sp(kvm, sp)) {
> > + flush = true;
> > + to_zap--;
>
> This is actively dangerous. In the old code, tdp_mmu_zap_sp() could fail only
> in a WARN-able scenario, i.e. in practice it was guaranteed to succeed. And the
> for-loop *always* decremented to_zap, i.e. couldn't get stuck in an infinite
> loop.
>
> Neither of those protections exists in this version. Obviously it shouldn't happen,
> but it's possible this could flail on the same SP over and over, since nothing
> guarantees forward progress. The cond_resched() would save KVM from true pain,
> but it's still a wart in the implementation.
>
> Rather than loop on to_zap, just do
>
> list_for_each_entry(...) {
>
> if (!to_zap)
> break;
> }
>
> And if we don't use separate lists, that'll be an improvement too, as KVM
> will only have to skip the "wrong" shadow pages once, whereas this approach means
> every iteration of the loop has to walk past the "wrong" shadow pages.
The TDP MMU accesses possible_nx_huge_pages under the tdp_mmu_pages_lock
spinlock. We cannot hold that lock for the entire duration of recovery.

In this patch, tdp_mmu_zap_sp() has been modified to retry failures,
similar to the other retry mechanisms in the TDP MMU. Won't those other
TDP MMU retry flows have the same issue?
>
> But I'd still prefer to use separate lists.