Message-ID: <1236de05e704f0a0b28dc0ad75f9ad4d81b7a057.camel@gmx.de>
Date: Thu, 22 Oct 2020 07:21:13 +0200
From: Mike Galbraith <efault@....de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
linux-rt-users <linux-rt-users@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>
Subject: ltp or kvm triggerable lockdep alloc_pid() deadlock gripe

Greetings,

The gripe below is repeatable in two ways here: boot with nomodeset so
nouveau doesn't steal the lockdep show, then fire up one of my (oink)
full distro VMs; or, from an ltp directory, run ./runltp -f cpuset with
the attached subset of the controllers file placed in the ./runtest dir.

Method 2 may lead to a real-deal deadlock; I've got a crashdump of one,
stack traces of the uninterruptible sleepers attached.
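Purely for illustration, the ordering conflict lockdep gripes about below
boils down to the classic AB-BA pattern. Here is a minimal userspace sketch
of that shape (pthread stand-ins only, not the actual alloc_pid()/free_pid()/
SLUB code paths; the lock names are just borrowed from the splat):

/*
 * Minimal sketch of an AB-BA lock inversion -- illustrative only, not the
 * kernel code.  lock_a stands in for pidmap_lock, lock_b for &s->seqcount.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* "pidmap_lock" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* "&s->seqcount" */

/* One ordering: A then B */
static void *cpu0(void *arg)
{
	pthread_mutex_lock(&lock_a);	/* lock(pidmap_lock)  */
	pthread_mutex_lock(&lock_b);	/* lock(&s->seqcount) <- may block forever */
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

/* The reverse ordering: B then A */
static void *cpu1(void *arg)
{
	pthread_mutex_lock(&lock_b);	/* lock(&s->seqcount) */
	pthread_mutex_lock(&lock_a);	/* lock(pidmap_lock)  <- may block forever */
	pthread_mutex_unlock(&lock_a);
	pthread_mutex_unlock(&lock_b);
	return NULL;
}

int main(void)
{
	pthread_t t0, t1;

	pthread_create(&t0, NULL, cpu0, NULL);
	pthread_create(&t1, NULL, cpu1, NULL);
	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	puts("no deadlock this time");
	return 0;
}

With unlucky timing t0 stalls on lock_b while t1 stalls on lock_a, which is
exactly the two-CPU interleave the "Possible unsafe locking scenario" box in
the splat below describes.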
[ 154.927302] ======================================================
[ 154.927303] WARNING: possible circular locking dependency detected
[ 154.927304] 5.9.1-rt18-rt #5 Tainted: G S E
[ 154.927305] ------------------------------------------------------
[ 154.927306] cpuset_inherit_/4992 is trying to acquire lock:
[ 154.927307] ffff9d334c5e64d8 (&s->seqcount){+.+.}-{0:0}, at: __slab_alloc.isra.87+0xad/0xc0
[ 154.927317]
but task is already holding lock:
[ 154.927317] ffffffffac4052d0 (pidmap_lock){+.+.}-{2:2}, at: alloc_pid+0x1fb/0x510
[ 154.927324]
which lock already depends on the new lock.
[ 154.927324]
the existing dependency chain (in reverse order) is:
[ 154.927325]
-> #1 (pidmap_lock){+.+.}-{2:2}:
[ 154.927328] lock_acquire+0x92/0x410
[ 154.927331] rt_spin_lock+0x2b/0xc0
[ 154.927335] free_pid+0x27/0xc0
[ 154.927338] release_task+0x34a/0x640
[ 154.927340] do_exit+0x6e9/0xcf0
[ 154.927342] kthread+0x11c/0x190
[ 154.927344] ret_from_fork+0x1f/0x30
[ 154.927347]
-> #0 (&s->seqcount){+.+.}-{0:0}:
[ 154.927350] validate_chain+0x981/0x1250
[ 154.927352] __lock_acquire+0x86f/0xbd0
[ 154.927354] lock_acquire+0x92/0x410
[ 154.927356] ___slab_alloc+0x71b/0x820
[ 154.927358] __slab_alloc.isra.87+0xad/0xc0
[ 154.927359] kmem_cache_alloc+0x700/0x8c0
[ 154.927361] radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927365] idr_get_free+0x207/0x2b0
[ 154.927367] idr_alloc_u32+0x54/0xa0
[ 154.927369] idr_alloc_cyclic+0x4f/0xa0
[ 154.927370] alloc_pid+0x22b/0x510
[ 154.927372] copy_process+0xeb5/0x1de0
[ 154.927375] _do_fork+0x52/0x750
[ 154.927377] __do_sys_clone+0x64/0x70
[ 154.927379] do_syscall_64+0x33/0x40
[ 154.927382] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.927384]
other info that might help us debug this:
[ 154.927384] Possible unsafe locking scenario:
[ 154.927385]        CPU0                    CPU1
[ 154.927386]        ----                    ----
[ 154.927386]   lock(pidmap_lock);
[ 154.927388]                                lock(&s->seqcount);
[ 154.927389]                                lock(pidmap_lock);
[ 154.927391]   lock(&s->seqcount);
[ 154.927392]
*** DEADLOCK ***
[ 154.927393] 4 locks held by cpuset_inherit_/4992:
[ 154.927394] #0: ffff9d33decea5b0 ((lock).lock){+.+.}-{2:2}, at: __radix_tree_preload+0x52/0x3b0
[ 154.927399] #1: ffffffffac598fa0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[ 154.927405] #2: ffffffffac4052d0 (pidmap_lock){+.+.}-{2:2}, at: alloc_pid+0x1fb/0x510
[ 154.927409] #3: ffffffffac598fa0 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0
[ 154.927414]
stack backtrace:
[ 154.927416] CPU: 3 PID: 4992 Comm: cpuset_inherit_ Kdump: loaded Tainted: G S E 5.9.1-rt18-rt #5
[ 154.927418] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
[ 154.927419] Call Trace:
[ 154.927422] dump_stack+0x77/0x9b
[ 154.927425] check_noncircular+0x148/0x160
[ 154.927432] ? validate_chain+0x981/0x1250
[ 154.927435] validate_chain+0x981/0x1250
[ 154.927441] __lock_acquire+0x86f/0xbd0
[ 154.927446] lock_acquire+0x92/0x410
[ 154.927449] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927452] ? kmem_cache_alloc+0x648/0x8c0
[ 154.927453] ? lock_acquire+0x92/0x410
[ 154.927458] ___slab_alloc+0x71b/0x820
[ 154.927460] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927463] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927468] ? __slab_alloc.isra.87+0x83/0xc0
[ 154.927472] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927474] ? __slab_alloc.isra.87+0xad/0xc0
[ 154.927476] __slab_alloc.isra.87+0xad/0xc0
[ 154.927480] ? radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927482] kmem_cache_alloc+0x700/0x8c0
[ 154.927487] radix_tree_node_alloc.constprop.22+0xa2/0xf0
[ 154.927491] idr_get_free+0x207/0x2b0
[ 154.927495] idr_alloc_u32+0x54/0xa0
[ 154.927500] idr_alloc_cyclic+0x4f/0xa0
[ 154.927503] alloc_pid+0x22b/0x510
[ 154.927506] ? copy_thread+0x88/0x200
[ 154.927512] copy_process+0xeb5/0x1de0
[ 154.927520] _do_fork+0x52/0x750
[ 154.927523] ? lock_acquire+0x92/0x410
[ 154.927525] ? __might_fault+0x3e/0x90
[ 154.927530] ? find_held_lock+0x2d/0x90
[ 154.927535] __do_sys_clone+0x64/0x70
[ 154.927541] do_syscall_64+0x33/0x40
[ 154.927544] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 154.927546] RIP: 0033:0x7f0b357356e3
[ 154.927548] Code: db 45 85 ed 0f 85 ad 01 00 00 64 4c 8b 04 25 10 00 00 00 31 d2 4d 8d 90 d0 02 00 00 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 f1 00 00 00 85 c0 41 89 c4 0f 85 fe 00 00
[ 154.927550] RSP: 002b:00007ffdfd6d15f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 154.927552] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0b357356e3
[ 154.927554] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 154.927555] RBP: 00007ffdfd6d1620 R08: 00007f0b36052b80 R09: 0000000000000072
[ 154.927556] R10: 00007f0b36052e50 R11: 0000000000000246 R12: 0000000000000000
[ 154.927557] R13: 0000000000000000 R14: 0000000000000000 R15: 00005614ef57ecf0
[Attachment: "cpuset" (text/plain, 587 bytes)]
[Attachment: "deadlock-log" (text/plain, 14019 bytes)]