Message-ID: <20250111010409.1252942-4-seanjc@google.com>
Date: Fri, 10 Jan 2025 17:04:07 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Peter Xu <peterx@...hat.com>, Yan Zhao <yan.y.zhao@...el.com>,
Maxim Levitsky <mlevitsk@...hat.com>, Sean Christopherson <seanjc@...gle.com>
Subject: [PATCH 3/5] KVM: Conditionally reschedule when resetting the dirty ring

When resetting a dirty ring, conditionally reschedule on each iteration
after the first. The recently introduced hard limit mitigates the issue
of an endless reset, but isn't sufficient to completely prevent RCU
stalls, soft lockups, etc., nor is the hard limit intended to guard
against such badness.
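
For illustration, the shape of the fix in a minimal sketch (hypothetical
names: harvest_next_entry(), reset_one_gfn(), and HARD_LIMIT stand in for
the real harvesting/reset logic and bound; only cond_resched() is the
actual kernel API):

	u64 gfn;
	int entries_reset = 0;

	while (entries_reset < HARD_LIMIT) {
		if (!harvest_next_entry(ring, &gfn))
			break;

		reset_one_gfn(kvm, gfn);
		entries_reset++;

		/* Reschedule on every iteration after the first. */
		if (entries_reset > 1)
			cond_resched();
	}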
Note! Take care to check for reschedule even in the "continue" paths,
as a pathological scenario (or malicious userspace) could dirty the same
gfn over and over, i.e. always hit the continue path.
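
To illustrate the pitfall with a simplified sketch (not the actual KVM
flow; can_coalesce() and reset_one_gfn() are hypothetical helpers): the
reschedule check must sit above any "continue", else a stream of
coalesced entries never reaches it.

	bool first_round = true;

	while (harvest_next_entry(ring, &entry)) {
		/* Check for a reschedule _before_ any continue path. */
		if (!first_round)
			cond_resched();
		first_round = false;

		if (can_coalesce(entry))
			continue;

		/*
		 * A cond_resched() placed down here would never run if
		 * userspace re-dirties the same gfn over and over.
		 */
		reset_one_gfn(kvm, entry);
	}
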
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 4-....: (5249 ticks this GP) idle=51e4/1/0x4000000000000000 softirq=309/309 fqs=2563
rcu: (t=5250 jiffies g=-319 q=608 ncpus=24)
CPU: 4 UID: 1000 PID: 1067 Comm: dirty_log_test Tainted: G L 6.13.0-rc3-17fa7a24ea1e-HEAD-vm #814
Tainted: [L]=SOFTLOCKUP
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x26/0x200 [kvm]
Call Trace:
<TASK>
kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
kvm_dirty_ring_reset+0x58/0x220 [kvm]
kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
__x64_sys_ioctl+0x8b/0xb0
do_syscall_64+0x5b/0x160
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>

Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Signed-off-by: Sean Christopherson <seanjc@...gle.com>
---
virt/kvm/dirty_ring.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index a81ad17d5eef..37eb2b7142bd 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -133,6 +133,16 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 
 		ring->reset_index++;
 		(*nr_entries_reset)++;
+
+		/*
+		 * While the size of each ring is fixed, it's possible for the
+		 * ring to be constantly re-dirtied/harvested while the reset
+		 * is in-progress (the hard limit exists only to guard against
+		 * wrapping the count into negative space).
+		 */
+		if (!first_round)
+			cond_resched();
+
 		/*
 		 * Try to coalesce the reset operations when the guest is
 		 * scanning pages in the same slot.
--
2.47.1.613.gc27f4b7a9f-goog