Message-ID: <1570462407.5576.292.camel@lca.pw>
Date: Mon, 07 Oct 2019 11:33:27 -0400
From: Qian Cai <cai@....pw>
To: Michal Hocko <mhocko@...nel.org>
Cc: Petr Mladek <pmladek@...e.com>, akpm@...ux-foundation.org,
sergey.senozhatsky.work@...il.com, rostedt@...dmis.org,
peterz@...radead.org, linux-mm@...ck.org,
john.ogness@...utronix.de, david@...hat.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
On Mon, 2019-10-07 at 17:12 +0200, Michal Hocko wrote:
> On Mon 07-10-19 10:59:10, Qian Cai wrote:
> [...]
> > It is almost impossible to eliminate all the indirect call chains from
> > console_sem/console_owner_lock to zone->lock because it is too normal that
> > something later needs to allocate some memory dynamically, so as long as it
> > directly calls printk() with zone->lock held, it will be in trouble.
>
> Do you have any example where the console driver really _has_ to
> allocate? Because I have a hard time believing this is going to work at
> all, as the atomic context doesn't allow any memory reclaim and
> such an allocation would fail too easily, so the driver cannot
> really rely on it.
I don't know how to explain this to you more clearly, but let me repeat it one
last time. The console driver does not need to allocate directly. Consider this
example:
CPU0:              CPU1:      CPU2:       CPU3:
console_sem->lock                         zone->lock
                   pi->lock
pi->lock                      rq_lock
                   rq->lock
                              zone->lock
                                          console_sem->lock
Here it only takes someone holding the rq_lock and allocating some memory. The
same is true for port_lock. Since the deadlock could involve many CPUs and a
longer lock chain, it is impossible to predict which allocation made while
holding a lock could end up in the same problematic lock chain.
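As an illustration only, the four-CPU chain above can be modeled as a directed
graph of "held -> wanted" lock orderings and checked for a cycle, loosely in the
spirit of what lockdep reports. This is a user-space sketch, not kernel code;
the enum names and helper functions are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock IDs for this illustration; these are not kernel symbols. */
enum lock_id { CONSOLE_SEM, PI_LOCK, RQ_LOCK, ZONE_LOCK, NR_LOCKS };

/* edge[a][b] is true when some CPU acquires b while already holding a. */
struct lock_graph {
    bool edge[NR_LOCKS][NR_LOCKS];
};

static void lock_graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

static void add_dependency(struct lock_graph *g, enum lock_id held,
                           enum lock_id wanted)
{
    assert(held < NR_LOCKS && wanted < NR_LOCKS);
    g->edge[held][wanted] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static bool dfs_finds_cycle(const struct lock_graph *g, int node, int *state)
{
    state[node] = 1;
    for (int next = 0; next < NR_LOCKS; next++) {
        if (!g->edge[node][next])
            continue;
        if (state[next] == 1)
            return true; /* back edge: the ordering is circular */
        if (state[next] == 0 && dfs_finds_cycle(g, next, state))
            return true;
    }
    state[node] = 2;
    return false;
}

static bool lock_graph_has_cycle(const struct lock_graph *g)
{
    int state[NR_LOCKS] = {0};

    for (int node = 0; node < NR_LOCKS; node++)
        if (state[node] == 0 && dfs_finds_cycle(g, node, state))
            return true;
    return false;
}

/* The four per-CPU orderings from the example in this mail. */
static void add_example_dependencies(struct lock_graph *g,
                                     bool printk_under_zone_lock)
{
    add_dependency(g, CONSOLE_SEM, PI_LOCK);       /* CPU0 */
    add_dependency(g, PI_LOCK, RQ_LOCK);           /* CPU1 */
    add_dependency(g, RQ_LOCK, ZONE_LOCK);         /* CPU2: allocates memory */
    if (printk_under_zone_lock)
        add_dependency(g, ZONE_LOCK, CONSOLE_SEM); /* CPU3: printk() */
}
```

With all four orderings recorded, lock_graph_has_cycle() returns true. Leaving
out the CPU3 edge (no printk() while zone->lock is held) makes the graph
acyclic, even though the other three orderings are unchanged.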
>
> So again, crippling the MM code just because of lockdep false positives
> or a broken console driver sounds like a wrong way to approach the
> problem.
>
> > [ 297.425964] -> #1 (&port_lock_key){-.-.}:
> > [ 297.425967] __lock_acquire+0x5b3/0xb40
> > [ 297.425967] lock_acquire+0x126/0x280
> > [ 297.425968] _raw_spin_lock_irqsave+0x3a/0x50
> > [ 297.425969] serial8250_console_write+0x3e4/0x450
> > [ 297.425970] univ8250_console_write+0x4b/0x60
> > [ 297.425970] console_unlock+0x501/0x750
> > [ 297.425971] vprintk_emit+0x10d/0x340
> > [ 297.425972] vprintk_default+0x1f/0x30
> > [ 297.425972] vprintk_func+0x44/0xd4
> > [ 297.425973] printk+0x9f/0xc5
> > [ 297.425974] register_console+0x39c/0x520
> > [ 297.425975] univ8250_console_init+0x23/0x2d
> > [ 297.425975] console_init+0x338/0x4cd
> > [ 297.425976] start_kernel+0x534/0x724
> > [ 297.425977] x86_64_start_reservations+0x24/0x26
> > [ 297.425977] x86_64_start_kernel+0xf4/0xfb
> > [ 297.425978] secondary_startup_64+0xb6/0xc0
>
> This is early init code again, so the lockdep report sounds like a false
> positive to me.
This is just the tip of the iceberg. It shows the lock dependency,
console_owner --> port_lock_key
which could easily happen anywhere with a simple printk().