[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <023e131b3de80c4bc2b6711804a4769466b90c6f.camel@infradead.org>
Date: Fri, 13 Jan 2023 10:33:33 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Paolo Bonzini <pbonzini@...hat.com>,
Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
seanjc@...gle.com, Joel Fernandes <joel@...lfernandes.org>,
Matthew Wilcox <willy@...radead.org>,
Josh Triplett <josh@...htriplett.org>, rcu@...r.kernel.org,
Michal Luczaj <mhal@...x.co>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] Documentation: kvm: fix SRCU locking order docs
On Fri, 2023-01-13 at 10:20 +0100, Paolo Bonzini wrote:
> On 1/13/23 08:18, Boqun Feng wrote:
> > On Thu, Jan 12, 2023 at 07:20:48AM -0800, Paul E. McKenney wrote:
> > > On Thu, Jan 12, 2023 at 08:24:16AM +0000, David Woodhouse wrote:
> > > > On Wed, 2023-01-11 at 13:30 -0500, Paolo Bonzini wrote:
> > > > >
> > > > > +- ``synchronize_srcu(&kvm->srcu)`` is called inside critical sections
> > > > > + for kvm->lock, vcpu->mutex and kvm->slots_lock. These locks _cannot_
> > > > > + be taken inside a kvm->srcu read-side critical section; that is, the
> > > > > + following is broken::
> > > > > +
> > > > > + srcu_read_lock(&kvm->srcu);
> > > > > + mutex_lock(&kvm->slots_lock);
> > > > > +
> > > >
> > > > "Don't tell me. Tell lockdep!"
> > > >
> > > > Did we conclude in
> > > > https://lore.kernel.org/kvm/122f38e724aae9ae8ab474233da1ba19760c20d2.camel@infradead.org/
> > > > that lockdep *could* be clever enough to catch a violation of this rule
> > > > by itself?
> > > >
> > > > The general case of the rule would be that 'if mutex A is taken in a
> > > > read-section for SCRU B, then any synchronize_srcu(B) while mutex A is
> > > > held shall be verboten'. And vice versa.
> > > >
> > > > If we can make lockdep catch it automatically, yay!
> > >
> > > Unfortunately, lockdep needs to see a writer to complain, and that patch
> > > just adds a reader. And adding that writer would make lockdep complain
> > > about things that are perfectly fine. It should be possible to make
> > > lockdep catch this sort of thing, but from what I can see, doing so
> > > requires modifications to lockdep itself.
> > >
> >
> > Please see if the follow patchset works:
> >
> > https://lore.kernel.org/lkml/20230113065955.815667-1-boqun.feng@gmail.com
> >
> > "I have been called. I must answer. Always." ;-)
Amazing! Thank you, Boqun!
> It's missing an important testcase; if it passes (does not warn), then
> it should work:
I think it does.
I started with kvm/master from
https://git.kernel.org/pub/scm/virt/kvm/kvm.git/log/?h=master so that
lockdep didn't find other things to complain about first. I then:
• Dropped the last two commits, putting us back to using kvm->lock and
removing the dummy mutex lock that would have told lockdep that it's
a (different) problem.
• I then added Boqun's three commits
• Reverted a79b53aa so that the srcu_synchronize() deadlock returns
• Couldn't reproduce with xen_shinfo_test, so added Michal's test from
https://lore.kernel.org/kvm/15599980-bd2e-b6c2-1479-e1eef02da0b5@rbox.co/
The resulting tree is at
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kvm-srcu-lockdep
Now when I run tools/testing/selftests/kvm/x86_64/deadlocks_test I do
see lockdep complain about it (shown below; I have a cosmetic
nit/request to make). If I restore the evtchn_reset fix then it also
complains about kvm_xen_set_evtchn() vs. kvm_xen_kvm_set_attr(), and if
I restore the xen_lock fix from the tip of kvm/master then Michal's
deadlock_test passes and there are no complaints.
So everything seems to be working as it should... *except* for the fact
that I don't quite understand why xen_shinfo_test didn't trigger the
warning. Michal, I guess you already worked that out when you came up
with your deadlock-test instead... is there something we should add to
xen_shinfo_test that would mean it *would* have triggered? I even tried
this:
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1173,6 +1173,16 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
if (init_srcu_struct(&kvm->irq_srcu))
goto out_err_no_irq_srcu;
+#ifdef CONFIG_LOCKDEP
+ /*
+ * Ensure lockdep knows that it's not permitted to lock kvm->lock
+ * from a SRCU read section on kvm->srcu.
+ */
+ mutex_lock(&kvm->lock);
+ synchronize_srcu(&kvm->srcu);
+ mutex_unlock(&kvm->lock);
+#endif
+
refcount_set(&kvm->users_count, 1);
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
for (j = 0; j < 2; j++) {
> [ 845.474169] ======================================================
> [ 845.474170] WARNING: possible circular locking dependency detected
> [ 845.474172] 6.2.0-rc3+ #1025 Tainted: G E
> [ 845.474175] ------------------------------------------------------
> [ 845.474176] deadlocks_test/22767 is trying to acquire lock:
> [ 845.474178] ffffc9000ba4b868 (&kvm->srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x170
> [ 845.474192]
> but task is already holding lock:
> [ 845.474194] ffffc9000ba423c0 (&kvm->lock){+.+.}-{3:3}, at: kvm_vm_ioctl_set_msr_filter+0x188/0x220 [kvm]
> [ 845.474319]
> which lock already depends on the new lock.
So the above part is good, and clearly tells me it was synchronize_srcu()
> [ 845.474320]
> the existing dependency chain (in reverse order) is:
> [ 845.474322]
> -> #1 (&kvm->lock){+.+.}-{3:3}:
> [ 845.474327] __lock_acquire+0x4b4/0x940
> [ 845.474333] lock_acquire.part.0+0xa8/0x210
> [ 845.474337] __mutex_lock+0x94/0x920
> [ 845.474344] kvm_xen_set_evtchn.part.0+0x6d/0x170 [kvm]
> [ 845.474437] kvm_xen_inject_timer_irqs+0x79/0xa0 [kvm]
> [ 845.474529] vcpu_run+0x20c/0x450 [kvm]
> [ 845.474618] kvm_arch_vcpu_ioctl_run+0x1df/0x670 [kvm]
> [ 845.474707] kvm_vcpu_ioctl+0x279/0x700 [kvm]
> [ 845.474783] __x64_sys_ioctl+0x8a/0xc0
> [ 845.474787] do_syscall_64+0x3b/0x90
> [ 845.474796] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 845.474804]
> -> #0 (&kvm->srcu){.+.+}-{0:0}:
> [ 845.474809] check_prev_add+0x8f/0xc20
> [ 845.474812] validate_chain+0x3ba/0x450
> [ 845.474814] __lock_acquire+0x4b4/0x940
> [ 845.474817] lock_sync+0x99/0x110
> [ 845.474820] __synchronize_srcu+0x4d/0x170
> [ 845.474824] kvm_vm_ioctl_set_msr_filter+0x1a5/0x220 [kvm]
. [ 845.474907] kvm_arch_vm_ioctl+0x8df/0xd50 [kvm]
> [ 845.474997] kvm_vm_ioctl+0x5ca/0x800 [kvm]
> [ 845.475075] __x64_sys_ioctl+0x8a/0xc0
> [ 845.475079] do_syscall_64+0x3b/0x90
> [ 845.475084] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 845.475089]
> other info that might help us debug this:
>
> [ 845.475091] Possible unsafe locking scenario:
>
> [ 845.475092] CPU0 CPU1
> [ 845.475093] ---- ----
> [ 845.475095] lock(&kvm->lock);
> [ 845.475098] lock(&kvm->srcu);
> [ 845.475101] lock(&kvm->lock);
> [ 845.475103] lock(&kvm->srcu);
> [ 845.475106]
> *** DEADLOCK ***
But is there any chance the above could say 'synchronize_srcu' and
'read_lock_srcu' in the appropriate places?
> [ 845.475108] 1 lock held by deadlocks_test/22767:
> [ 845.475110] #0: ffffc9000ba423c0 (&kvm->lock){+.+.}-{3:3}, at: kvm_vm_ioctl_set_msr_filter+0x188/0x220 [kvm]
> [ 845.475200]
> stack backtrace:
> [ 845.475202] CPU: 10 PID: 22767 Comm: deadlocks_test Tainted: G E 6.2.0-rc3+ #1025
> [ 845.475206] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
> [ 845.475208] Call Trace:
> [ 845.475210] <TASK>
> [ 845.475214] dump_stack_lvl+0x56/0x73
> [ 845.475221] check_noncircular+0x102/0x120
> [ 845.475229] ? check_noncircular+0x7f/0x120
> [ 845.475236] check_prev_add+0x8f/0xc20
> [ 845.475239] ? add_chain_cache+0x10b/0x2d0
> [ 845.475244] validate_chain+0x3ba/0x450
> [ 845.475249] __lock_acquire+0x4b4/0x940
> [ 845.475253] ? __synchronize_srcu+0x5/0x170
> [ 845.475258] lock_sync+0x99/0x110
> [ 845.475261] ? __synchronize_srcu+0x5/0x170
> [ 845.475265] __synchronize_srcu+0x4d/0x170
? [ 845.475269] ? mark_held_locks+0x49/0x80
> [ 845.475272] ? _raw_spin_unlock_irqrestore+0x2d/0x60
> [ 845.475278] ? __pfx_read_tsc+0x10/0x10
> [ 845.475286] ? ktime_get_mono_fast_ns+0x3d/0x90
> [ 845.475292] kvm_vm_ioctl_set_msr_filter+0x1a5/0x220 [kvm]
> [ 845.475380] kvm_arch_vm_ioctl+0x8df/0xd50 [kvm]
> [ 845.475472] ? __lock_acquire+0x4b4/0x940
> [ 845.475485] kvm_vm_ioctl+0x5ca/0x800 [kvm]
> [ 845.475566] ? lockdep_unregister_key+0x76/0x110
> [ 845.475575] __x64_sys_ioctl+0x8a/0xc0
> [ 845.475579] do_syscall_64+0x3b/0x90
> [ 845.475586] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 845.475591] RIP: 0033:0x7f79de23fd1b
> [ 845.475595] Code: 73 01 c3 48 8b 0d 05 a1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 a0 1b 00 f7 d8 64 89 01 48
> [ 845.475598] RSP: 002b:00007f79ddff7c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 845.475602] RAX: ffffffffffffffda RBX: 00007f79ddff8640 RCX: 00007f79de23fd1b
> [ 845.475605] RDX: 00007f79ddff7ca0 RSI: 000000004188aec6 RDI: 0000000000000004
> [ 845.475607] RBP: 00007f79ddff85c0 R08: 0000000000000000 R09: 00007fffceb1ff2f
> [ 845.475609] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f79ddff7ca0
> [ 845.475611] R13: 0000000001c322a0 R14: 00007f79de2a05f0 R15: 0000000000000000
> [ 845.475617] </TASK>
Download attachment "smime.p7s" of type "application/pkcs7-signature" (5965 bytes)
Powered by blists - more mailing lists