Message-ID: <1574456603.9585.28.camel@lca.pw>
Date: Fri, 22 Nov 2019 16:03:23 -0500
From: Qian Cai <cai@....pw>
To: Peter Zijlstra <peterz@...radead.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
akpm@...ux-foundation.org, cl@...ux.com, keescook@...omium.org,
penberg@...nel.org, rientjes@...gle.com, thgarnie@...gle.com,
tytso@....edu, will@...nel.org, Ingo Molnar <mingo@...nel.org>,
Borislav Petkov <bp@...en8.de>
Subject: Re: [tip: sched/urgent] sched/core: Avoid spurious lock dependencies
On Fri, 2019-11-22 at 21:20 +0100, Peter Zijlstra wrote:
> On Fri, Nov 22, 2019 at 09:01:22PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2019-11-13 10:06:28 [-0000], tip-bot2 for Peter Zijlstra wrote:
> > > sched/core: Avoid spurious lock dependencies
> > >
> > > While seemingly harmless, __sched_fork() does hrtimer_init(), which,
> > > when DEBUG_OBJECTS is enabled, can end up doing allocations.
> > >
> > > This then results in the following lock order:
> > >
> > > rq->lock
> > > zone->lock.rlock
> > > batched_entropy_u64.lock
> > >
> > > Which in turn causes deadlocks when we do wakeups while holding that
> > > batched_entropy lock -- as the random code does.
> >
> > Peter, can it _really_ cause deadlocks? My understanding was that the
> > batched_entropy_u64.lock is a per-CPU lock and can _not_ cause a
> > deadlock because it can always be acquired on multiple CPUs
> > simultaneously (and it is never acquired cross-CPU).
> > Lockdep is simply not smart enough to see that and complains about it
> > like it would complain about a regular lock in this case.
>
> That part yes. That is, even holding a per-cpu lock you can do a wakeup
> to the local cpu and recurse back onto rq->lock.
>
> However, I don't think it can actually happen because this
> is init_idle, and we only ever do that on hotplug, so actually creating
> the concurrency required for the deadlock might be tricky.
>
> Still, moving that thing out from under the lock was simple and correct.
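
To picture the per-CPU lock Sebastian describes above: the batched entropy
state is one instance per CPU, each with its own spinlock, but lockdep keys
on the lock class rather than the instance, so it reports the chain even
though two CPUs never contend on the same lock. A simplified sketch of the
pattern (fields approximated and the helper name made up for illustration;
the real code is in drivers/char/random.c):

#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <linux/types.h>

/*
 * Simplified sketch of the per-CPU batched entropy state; the real
 * structure and refill logic differ in detail.
 */
struct batched_entropy {
	u64		entropy_u64[16];
	unsigned int	position;
	spinlock_t	batch_lock;	/* one lock instance per CPU */
};

static DEFINE_PER_CPU(struct batched_entropy, batched_entropy_u64) = {
	.batch_lock = __SPIN_LOCK_UNLOCKED(batched_entropy_u64.lock),
};

static u64 get_batched_u64_sketch(void)
{
	struct batched_entropy *batch;
	unsigned long flags;
	u64 ret;

	/*
	 * Only the local CPU's instance is ever taken, so two CPUs never
	 * block each other here; lockdep nevertheless files every instance
	 * under the single class "batched_entropy_u64.lock" and reports
	 * dependency chains as if it were one lock.
	 */
	batch = raw_cpu_ptr(&batched_entropy_u64);
	spin_lock_irqsave(&batch->batch_lock, flags);
	ret = batch->entropy_u64[batch->position++ % 16];
	spin_unlock_irqrestore(&batch->batch_lock, flags);
	return ret;
}
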
Well, the patch alone fixed a real deadlock during boot.
https://lore.kernel.org/lkml/1566509603.5576.10.camel@lca.pw/
It needs DEBUG_OBJECTS=y to trigger though.
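
The DEBUG_OBJECTS dependence is that hrtimer_init() then registers the timer
with debugobjects, which may refill its tracking-object pool with an
allocation and thereby take zone->lock. A rough sketch of that step (the
sketch_* names are stand-ins; the real code is in kernel/time/hrtimer.c and
lib/debugobjects.c):

#include <linux/hrtimer.h>
#include <linux/debugobjects.h>

/*
 * Stand-ins so the sketch is self-contained; the kernel uses
 * hrtimer_debug_descr and __hrtimer_init() here.
 */
static struct debug_obj_descr sketch_hrtimer_debug_descr;
static void sketch_do_hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
				   enum hrtimer_mode mode) { }

static void sketch_hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
				enum hrtimer_mode mode)
{
	/*
	 * With CONFIG_DEBUG_OBJECTS_TIMERS=y the timer is registered with
	 * the debugobjects tracker.  If the tracker's object pool runs low,
	 * __debug_object_init() refills it via fill_pool() ->
	 * kmem_cache_alloc(), which can reach the page allocator and take
	 * zone->lock -- the step visible in stack [2] below.  Call this
	 * under rq->lock (as init_idle() -> __sched_fork() did) and lockdep
	 * records the rq->lock -> zone->lock dependency.
	 */
	debug_object_init(timer, &sketch_hrtimer_debug_descr);

	sketch_do_hrtimer_init(timer, clock_id, mode);
}
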
Suppose it does happen:
CPU0: zone_lock -> printk() [1]
another CPU: printk() -> zone_lock [2]
[1]
[ 1078.599835][T43784] -> #1 (console_owner){-...}:
[ 1078.606618][T43784] __lock_acquire+0x5c8/0xbb0
[ 1078.611661][T43784] lock_acquire+0x154/0x428
[ 1078.616530][T43784] console_unlock+0x298/0x898
[ 1078.621573][T43784] vprintk_emit+0x2d4/0x460
[ 1078.626442][T43784] vprintk_default+0x48/0x58
[ 1078.631398][T43784] vprintk_func+0x194/0x250
[ 1078.636267][T43784] printk+0xbc/0xec
[ 1078.640443][T43784] _warn_unseeded_randomness+0xb4/0xd0
[ 1078.646267][T43784] get_random_u64+0x4c/0x100
[ 1078.651224][T43784] add_to_free_area_random+0x168/0x1a0
[ 1078.657047][T43784] free_one_page+0x3dc/0xd08
[2]
[ 317.337609] -> #3 (&(&zone->lock)->rlock){-.-.}:
[ 317.337612] __lock_acquire+0x5b3/0xb40
[ 317.337613] lock_acquire+0x126/0x280
[ 317.337613] _raw_spin_lock+0x2f/0x40
[ 317.337614] rmqueue_bulk.constprop.21+0xb6/0x1160
[ 317.337615] get_page_from_freelist+0x898/0x22c0
[ 317.337616] __alloc_pages_nodemask+0x2f3/0x1cd0
[ 317.337617] alloc_page_interleave+0x18/0x130
[ 317.337618] alloc_pages_current+0xf6/0x110
[ 317.337619] allocate_slab+0x4c6/0x19c0
[ 317.337620] new_slab+0x46/0x70
[ 317.337621] ___slab_alloc+0x58b/0x960
[ 317.337621] __slab_alloc+0x43/0x70
[ 317.337622] kmem_cache_alloc+0x354/0x460
[ 317.337623] fill_pool+0x272/0x4b0
[ 317.337624] __debug_object_init+0x86/0x790
[ 317.337624] debug_object_init+0x16/0x20
[ 317.337625] hrtimer_init+0x27/0x1e0
[ 317.337626] init_dl_task_timer+0x20/0x40
[ 317.337627] __sched_fork+0x10b/0x1f0
[ 317.337627] init_idle+0xac/0x520
[ 317.337628] idle_thread_get+0x7c/0xc0
[ 317.337629] bringup_cpu+0x1a/0x1e0
[ 317.337630] cpuhp_invoke_callback+0x197/0x1120
[ 317.337630] _cpu_up+0x171/0x280
[ 317.337631] do_cpu_up+0xb1/0x120
[ 317.337632] cpu_up+0x13/0x20
[ 317.337635] -> #2 (&rq->lock){-.-.}:
[ 317.337638] __lock_acquire+0x5b3/0xb40
[ 317.337639] lock_acquire+0x126/0x280
[ 317.337639] _raw_spin_lock+0x2f/0x40
[ 317.337640] task_fork_fair+0x43/0x200
[ 317.337641] sched_fork+0x29b/0x420
[ 317.337642] copy_process+0xf3c/0x2fd0
[ 317.337642] _do_fork+0xef/0x950
[ 317.337643] kernel_thread+0xa8/0xe0
[ 317.337649] -> #1 (&p->pi_lock){-.-.}:
[ 317.337651] __lock_acquire+0x5b3/0xb40
[ 317.337652] lock_acquire+0x126/0x280
[ 317.337653] _raw_spin_lock_irqsave+0x3a/0x50
[ 317.337653] try_to_wake_up+0xb4/0x1030
[ 317.337654] wake_up_process+0x15/0x20
[ 317.337655] __up+0xaa/0xc0
[ 317.337655] up+0x55/0x60
[ 317.337656] __up_console_sem+0x37/0x60
[ 317.337657] console_unlock+0x3a0/0x750
[ 317.337658] vprintk_emit+0x10d/0x340
[ 317.337658] vprintk_default+0x1f/0x30
[ 317.337659] vprintk_func+0x44/0xd4
[ 317.337660] printk+0x9f/0xc5
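
For completeness, "moving that thing out from under the lock" amounts to
doing the __sched_fork() work in init_idle() before pi_lock and rq->lock are
taken, so the DEBUG_OBJECTS allocation can no longer happen under rq->lock.
A sketch of the idea, not the exact diff:

void init_idle(struct task_struct *idle, int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	unsigned long flags;

	/*
	 * Run this before taking the locks: with DEBUG_OBJECTS the
	 * hrtimer_init() calls inside __sched_fork() may allocate, and
	 * that must not happen under rq->lock.
	 */
	__sched_fork(0, idle);

	raw_spin_lock_irqsave(&idle->pi_lock, flags);
	raw_spin_lock(&rq->lock);

	/* __sched_fork(0, idle) used to be called here, under both locks. */
	idle->state = TASK_RUNNING;
	idle->se.exec_start = sched_clock();

	/* ... rest of init_idle() unchanged ... */

	raw_spin_unlock(&rq->lock);
	raw_spin_unlock_irqrestore(&idle->pi_lock, flags);
}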