linux-kernel - Re: [PATCH printk v4 27/27] lockdep: Mark emergency sections in lockdep splats

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <874jc1lb9r.fsf@jogness.linutronix.de>
Date: Tue, 16 Apr 2024 14:59:04 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Petr Mladek <pmladek@...e.com>, Sergey Senozhatsky
 <senozhatsky@...omium.org>, Steven Rostedt <rostedt@...dmis.org>, Thomas
 Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org, Ingo Molnar
 <mingo@...hat.com>, Will Deacon <will@...nel.org>, Waiman Long
 <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH printk v4 27/27] lockdep: Mark emergency sections in
 lockdep splats

On 2024-04-16, Peter Zijlstra <peterz@...radead.org> wrote:
>> Mark emergency sections wherever multiple lines of
>> lock debugging output are generated. In an emergency
>> section the CPU will not perform console output for the
>> printk() calls. Instead, a flushing of the console
>> output is triggered when exiting the emergency section.
>> This allows the full message block to be stored as
>> quickly as possible in the ringbuffer.
>
> I am confused, when in emergency I want the thing to dump everything to
> the atomic thing asap.

At LPC 2022 we discussed this and agreed that it is more appropriate to
get the full set of emergency printk messages into the ringbuffer
first. This is fast and lockless. Then we can begin the slow process of
printing.

Perhaps I am making these emergency sections too large. Maybe only
certain backtraces should use them. We will need to gain some experience
here.

> Storing it all up runs the risk of never getting to the 'complete' point
> because we're dead.

If no other debugging mechanisms are available to read the ringbuffer
and no other CPUs are alive to handle the printing: yes, we are
dead. But if the machine is dying so quickly, it would probably die
before getting the first line out anyway.

Note that by "dead" we mean an irresponsive system that does not panic.

John