Message-ID: <YlCNpQ9nkD1ToY13@google.com>
Date: Fri, 8 Apr 2022 19:31:49 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, David Matlack <dmatlack@...gle.com>,
Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH v2] KVM: x86/mmu: Update number of zapped pages even if
page list is stable
Very high latency ping: this is still problematic and the patch still applies cleanly.
On Mon, Nov 29, 2021, Sean Christopherson wrote:
> When zapping obsolete pages, update the running count of zapped pages
> regardless of whether or not the list has become unstable due to zapping
> a shadow page with its own child shadow pages. If the VM is backed by
> mostly 4KiB pages, KVM can zap an absurd number of SPTEs without bumping
> the batch count and thus without yielding. In the worst case scenario,
> this can cause a soft lockup.
>
> watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [dirty_log_perf_:13020]
> RIP: 0010:workingset_activation+0x19/0x130
> mark_page_accessed+0x266/0x2e0
> kvm_set_pfn_accessed+0x31/0x40
> mmu_spte_clear_track_bits+0x136/0x1c0
> drop_spte+0x1a/0xc0
> mmu_page_zap_pte+0xef/0x120
> __kvm_mmu_prepare_zap_page+0x205/0x5e0
> kvm_mmu_zap_all_fast+0xd7/0x190
> kvm_mmu_invalidate_zap_pages_in_memslot+0xe/0x10
> kvm_page_track_flush_slot+0x5c/0x80
> kvm_arch_flush_shadow_memslot+0xe/0x10
> kvm_set_memslot+0x1a8/0x5d0
> __kvm_set_memory_region+0x337/0x590
> kvm_vm_ioctl+0xb08/0x1040
>
> Fixes: fbb158cb88b6 ("KVM: x86/mmu: Revert "Revert "KVM: MMU: zap pages in batch""")
> Reported-by: David Matlack <dmatlack@...gle.com>
> Reviewed-by: Ben Gardon <bgardon@...gle.com>
> Cc: stable@...r.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>
> v2:
> - Rebase to kvm/master, commit 30d7c5d60a88 ("KVM: SEV: expose...")
> - Collect Ben's review, modulo bad splat.
> - Copy+paste the correct splat and symptom. [David].
>
> @David, I kept the unstable declaration out of the loop, mostly because I
> really don't like putting declarations in loops, but also because
> nr_zapped is declared out of the loop and I didn't want to change that
> unnecessarily or make the code inconsistent.
>
> arch/x86/kvm/mmu/mmu.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 0c839ee1282c..208c892136bf 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5576,6 +5576,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
>  {
>  	struct kvm_mmu_page *sp, *node;
>  	int nr_zapped, batch = 0;
> +	bool unstable;
>
>  restart:
>  	list_for_each_entry_safe_reverse(sp, node,
> @@ -5607,11 +5608,12 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm)
>  			goto restart;
>  		}
>
> -		if (__kvm_mmu_prepare_zap_page(kvm, sp,
> -				&kvm->arch.zapped_obsolete_pages, &nr_zapped)) {
> -			batch += nr_zapped;
> +		unstable = __kvm_mmu_prepare_zap_page(kvm, sp,
> +				&kvm->arch.zapped_obsolete_pages, &nr_zapped);
> +		batch += nr_zapped;
> +
> +		if (unstable)
>  			goto restart;
> -		}
>  	}
>
>  	/*
> --
> 2.34.0.rc2.393.gf8c9666880-goog