linux-kernel - Re: [GIT PULL] printk for 6.11

Message-ID: <87ed7jvo2c.fsf@jogness.linutronix.de>
Date: Tue, 23 Jul 2024 22:47:15 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>, Petr Mladek
 <pmladek@...e.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt
 <rostedt@...dmis.org>, Andy Shevchenko
 <andriy.shevchenko@...ux.intel.com>, Rasmus Villemoes
 <linux@...musvillemoes.dk>, Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>, Thomas Gleixner <tglx@...utronix.de>, Jan Kara
 <jack@...e.cz>, Peter Zijlstra <peterz@...radead.org>,
 linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] printk for 6.11

Hi Linus,

On 2024-07-23, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>>   - In an emergency section, directly from nbcon_cpu_emergency_exit()
>>     or nbcon_cpu_emergency_flush(). It allows to see the messages
>>     when the system is in an unexpected state and might not be
>>     able to continue properly.
>>
>>     The messages are flushed at the end of the emergency section
>>     to allow storing the full log (backtrace) first.
>
> What? No.
>
> One of the historically problematic situations is when a recursive
> oops or a deadlock occurs *during* the first oops.
>
> The "recursive oops" may be simple to sort out by forcing a flush at
> that point, in that hopefully the machine is "alive", but what about
> random deadlocks or other situations where the printk machinery simply
> is never ever entered again?
>
> And we most definitely have had exactly that happen due to the call
> trace code etc.
>
> At that point, it's ok if the machine is dead (this is obviously a
> very catastrophic situation - nobody should worry about how to
> continue), but it's really important that the first problem report
> makes it out.
>
> The whole notion of "to allow storing the full log (backtrace) first"
> is completely crazy. It's entirely secondary whether you have a full
> log or not, when the primary goal MUST BE that you have any output at
> all!
>
> How can this have _continued_ to be unclear, when it was my one hard
> requirement for this whole thing from day one? My *ONE* requirement
> has always been that the printk code ALWAYS does its absolute best to
> print out problem reports.
>
> Because when an oops happen, all other rules go out the window.
>
> We no longer care about "what pretty printouts", and we should strive
> to always try to just get at least *some* basic print out. The kernel
> is known to not be in a great state, and maybe the printout will fail
> due to where the problem happened, but the kernel NEEDS TO TRY.

As the primary author, I would like to clarify the motivation.

This requirement came out of the printk proof-of-concept demonstration
at LPC 2022. The second point in my summary [0] of that meeting stated
the new requirement.

The requirement came about because during the demonstration we
accidentally hit a situation where a warning backtrace could not be seen
because while trying to print the warning, a panic was hit. In the end
we could see the panic, but not the original warning. At that meeting we
generally agreed that it would be better to at least get the backtrace
into the buffer before entering the complex machinery of pushing out the
backlog to consoles. Then, if a panic occurs while printing, the warning
is already in the buffer and will be flushed out ahead of any panic
messages. That discussion is available online [1] (starting at 56:20).

In your response [2] to my summary email, you mentioned that we could
tweak how much is buffered and possibly change the print order to get
the important stuff out first. But it all relied on the ability to get
things into the buffer first without requiring each individual printk()
to synchronously push out the backlog to all consoles.

Petr's pull request provides the functionality for a CPU to call
printk() during emergencies so that each line only goes into the
buffer. We also include a function to perform the flush at any time. As
the series is implemented now, that flush happens after the warning is
completely stored into the buffer. In cases where there is lots of data
in the warning (such as in RCU stalls or lockdep splats), flushes
happen after significant parts of the warning.
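
To make the store-first, flush-later behavior concrete, here is a
minimal userspace sketch. It is a toy model only, not kernel code:
ToyLog, emit_warning() and the die_after parameter are hypothetical
names invented for illustration, and the model assumes a single console
and a single CPU. It shows both sides of the discussion: a complete
warning reaches the console in order on exit from the emergency
section, while a CPU that dies inside the section leaves the stored
lines in the buffer with nothing printed yet.

```python
class ToyLog:
    """Toy model of a log buffer feeding one console (hypothetical)."""

    def __init__(self):
        self.buffer = []           # every line stored by printk()
        self.console = []          # lines that actually reached the console
        self.printed = 0           # number of buffered lines already printed
        self.in_emergency = False

    def printk(self, line):
        # Always store first. Inside an emergency section, only store.
        self.buffer.append(line)
        if not self.in_emergency:
            self.flush()

    def flush(self):
        # Push all stored-but-unprinted lines to the console, in order.
        self.console.extend(self.buffer[self.printed:])
        self.printed = len(self.buffer)

    def emergency_enter(self):
        self.in_emergency = True

    def emergency_exit(self):
        # Leaving the emergency section triggers the deferred flush.
        self.in_emergency = False
        self.flush()


def emit_warning(log, lines, die_after=None):
    """Store a warning inside an emergency section, then flush on exit.

    die_after=N models the CPU dying once N lines have been stored, so
    emergency_exit() (and therefore the flush) is never reached.
    Returns True if the section completed, False if the CPU "died".
    """
    log.emergency_enter()
    for stored, line in enumerate(lines):
        if stored == die_after:
            return False
        log.printk(line)
    log.emergency_exit()
    return True


if __name__ == "__main__":
    warning = ["WARNING: something odd", "frame A", "frame B"]

    healthy = ToyLog()
    emit_warning(healthy, warning)
    print("console after full warning:", healthy.console)

    dead = ToyLog()
    emit_warning(dead, warning, die_after=2)
    print("buffer after early death:  ", dead.buffer)
    print("console after early death: ", dead.console)
```

In this model, calling flush() later (as a subsequent panic path might)
prints the stored warning lines before anything stored after them, which
is the ordering described above.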

John Ogness

[0] https://lore.kernel.org/lkml/875yheqh6v.fsf@jogness.linutronix.de

[1] https://www.youtube.com/watch?v=TVhNcKQvzxI (from 56:20)

[2] https://lore.kernel.org/lkml/CAHk-=wieXPMGEm7E=Sz2utzZdW1d=9hJBwGYAaAipxnMXr0Hvg@mail.gmail.com