Message-ID: <20200325023506.GB241329@google.com>
Date: Wed, 25 Mar 2020 11:35:06 +0900
From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
To: Zygo Blaxell <uixjjji1@...il.furryterror.org>
Cc: Qian Cai <cai@....pw>, tytso@....edu, arnd@...db.de,
gregkh@...uxfoundation.org, sergey.senozhatsky.work@...il.com,
pmladek@...e.com, rostedt@...dmis.org, catalin.marinas@....com,
will@...nel.org, dan.j.williams@...el.com, peterz@...radead.org,
longman@...hat.com, tglx@...utronix.de, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: dmesg -w regression in v5.4.22, bisected, was: Re: [PATCH]
char/random: silence a lockdep splat with printk()
On (20/03/24 11:13), Zygo Blaxell wrote:
> On Wed, Nov 13, 2019 at 04:16:25PM -0500, Qian Cai wrote:
> > From: Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
> >
> > Sergey didn't like the locking order,
> >
> > uart_port->lock -> tty_port->lock
> >
> >   uart_write (uart_port->lock)
> >     __uart_start
> >       pl011_start_tx
> >         pl011_tx_chars
> >           uart_write_wakeup
> >             tty_port_tty_wakeup
> >               tty_port_default
> >                 tty_port_tty_get (tty_port->lock)
> >
> > but that code is so old, and I have no clue how to de-couple it after
> > checking other locks in the splat. There is an ongoing effort to make all
> > printk() calls deferred, so until that happens, work around it for now as a
> > short-term fix.
>
> Starting with v5.4.22 I noticed 'dmesg -w' stopped working on some
> machines. dmesg will follow console output for a few seconds, then it
> stops. strace indicates dmesg is blocked in read() on the /dev/kmsg fd.
> If a new dmesg process starts, it gives messages for a few seconds,
> then also stops. rsyslog's kernel logging is similarly affected.
>
> Bisection points to this patch (now known as
> 1b710b1b10eff9d46666064ea25f079f70bc67a8 upstream). I can't reproduce
> the problem on a test VM, and some machines are running v5.4.22..v5.4.26
> with no dmesg problems. It seems there is some magic in the startup
> sequence of affected machines. This code isn't executed after RNG is
> seeded, so it would have to get its bad stuff done before that happens.
>
> Reverting commit 1b710b1b10eff9d46666064ea25f079f70bc67a8 fixes the
> dmesg regression on 5.4.26. It might put the original lockdep bug back,
> but on machines running stable kernels, I prefer randomly broken lockdep
> over repeatably broken dmesg.

This should fix the problem:

https://lore.kernel.org/lkml/20200303113002.63089-1-sergey.senozhatsky@gmail.com

	-ss