Message-ID: <CAJmjG291w2ZPRiAevSzxGNcuR6vTuqyk6z4SG3xRsbaQh5U3zQ@mail.gmail.com>
Date: Wed, 3 Oct 2018 11:37:56 -0700
From: Daniel Wang <wonderfly@...gle.com>
To: rostedt@...dmis.org
Cc: Petr Mladek <pmladek@...e.com>, stable@...r.kernel.org,
Alexander.Levin@...rosoft.com, akpm@...ux-foundation.org,
byungchul.park@....com, dave.hansen@...el.com, hannes@...xchg.org,
jack@...e.cz, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Mel Gorman <mgorman@...e.de>, mhocko@...nel.org, pavel@....cz,
penguin-kernel@...ove.sakura.ne.jp, peterz@...radead.org,
tj@...nel.org, torvalds@...ux-foundation.org, vbabka@...e.cz,
Cong Wang <xiyou.wangcong@...il.com>,
Peter Feiner <pfeiner@...gle.com>
Subject: Re: 4.14 backport request for dbdda842fe96f: "printk: Add console
owner and waiter logic to load balance console writes"
On Wed, Oct 3, 2018 at 10:37 AM Steven Rostedt <rostedt@...dmis.org> wrote:
> Just so I understand correctly. Does the panic hit with and without the
> suggested backport patch? The only difference is that you get the full
> output with the patch and limited output without it?
When `softlockup_panic` is set (which is what my original repro had and
what we use in production), without the backport patch, the expected panic
hits what seems to be a deadlock. So even when the machine is configured
to reboot immediately after the panic (kernel.panic=-1), it just hangs there
with an incomplete backtrace. With your patch, the deadlock doesn't happen
and the machine reboots successfully.
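For reference, the configuration described above corresponds roughly to the
sysctl fragment below. This is only a sketch: the file name is hypothetical,
and it assumes the options are set through sysctl rather than on the kernel
command line.

```
# /etc/sysctl.d/99-softlockup.conf (hypothetical file name)
# Panic when the soft lockup detector fires.
kernel.softlockup_panic = 1
# A negative value makes the kernel reboot immediately after a panic.
kernel.panic = -1
```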
This was and still is the issue this thread is trying to fix. The last log
snippet was from an "experiment" that I did in order to understand what's
really happening. So far the speculation has been that the panic path was
trying to get a lock held by a backtrace dumping thread, but there is not
enough evidence to show which thread is holding the lock and how it uses it.
So I set `softlockup_panic` to 0, to get panic out of the equation. Then I saw
that one CPU was indeed holding the console lock, trying to write something
out. If the panic were to hit while it's doing that, we might get a deadlock.
>
> -- Steve
>
--
Best,
Daniel