linux-kernel - Re: [GIT PULL] printk for 6.11


Message-ID: <ZqJKbcLgTeYRkDd6@pathway.suse.cz>
Date: Thu, 25 Jul 2024 14:51:57 +0200
From: Petr Mladek <pmladek@...e.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
	John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Thomas Gleixner <tglx@...utronix.de>, Jan Kara <jack@...e.cz>,
	linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] printk for 6.11

On Wed 2024-07-24 13:33:20, Linus Torvalds wrote:
> On Wed, 24 Jul 2024 at 05:47, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > So.. I've complained about this emergency buffering before. At the very
> > least the atomic consoles should never buffer and immediately print
> > everything. Per their definition they always work.
> 
> Yeah, my personal preference would be some variation of this.
>
> And when I say "some variation of this", I do think that having a
> per-console trylock is fine, and buffering *if* the atomic console is
> already busy (presumably with an existing oops, but possibly also for
> "setup issues" - ie things like "serial line is being configured" or
> "VGACON is in the middle of a redraw or console size change
> operation".
> 
> And yes, before anybody speaks up, that is kind of the approximation
> of the current console_trylock() logic. I am aware. And I'm also aware
> of how much people have hated it. And I'm not claiming it's perfect.

I am afraid that we have to live with some buffering. Otherwise,
the speed of the system might be limited by the speed of the consoles.
This might be especially noticeable during boot when a lot of HW
gets initialized and tons of messages are flushed to a slow serial console.

After all, the trylock trick was added back in 2001, only 3 years after
SMP support (console_lock) was added to consoles in 1998.

> But I do think that the *typically* important case is "something went
> horribly wrong, and the console was *not* busy at the time", and
> that's the case where there is no excuse to not just print out ASAP.

Yup.
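
To make the policy concrete, here is a minimal user-space sketch of
"print immediately when the console is free, buffer only when it is
busy". It is only an illustration, not the kernel code: struct
toy_console and all toy_*() helpers are hypothetical names, the fixed-size
backlog stands in for the printk ringbuffer, and the backlog handling is
not safe against concurrent writers (the sketch is meant to be walked
through on a single thread).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BACKLOG_MAX 8
#define TOY_OUT_MAX     16
#define TOY_MSG_MAX     64

/* Hypothetical console model; NOT the kernel's struct console. */
struct toy_console {
	atomic_flag busy;                           /* per-console lock */
	char backlog[TOY_BACKLOG_MAX][TOY_MSG_MAX]; /* deferred messages */
	size_t backlog_len;
	char out[TOY_OUT_MAX][TOY_MSG_MAX];         /* what reached the "device" */
	size_t out_len;
};

static void toy_copy(char *dst, const char *src)
{
	strncpy(dst, src, TOY_MSG_MAX - 1);
	dst[TOY_MSG_MAX - 1] = '\0';
}

static void toy_console_init(struct toy_console *con)
{
	atomic_flag_clear(&con->busy);
	con->backlog_len = 0;
	con->out_len = 0;
}

/* Write one message to the simulated device; dropped if the log is full. */
static void toy_emit(struct toy_console *con, const char *msg)
{
	if (con->out_len < TOY_OUT_MAX)
		toy_copy(con->out[con->out_len++], msg);
}

/* Try to become the console owner; never blocks. */
static bool toy_acquire(struct toy_console *con)
{
	return !atomic_flag_test_and_set(&con->busy);
}

/* The owner flushes whatever was buffered meanwhile, then releases. */
static void toy_release(struct toy_console *con)
{
	for (size_t i = 0; i < con->backlog_len; i++)
		toy_emit(con, con->backlog[i]);
	con->backlog_len = 0;
	atomic_flag_clear(&con->busy);
}

/*
 * Returns true when the message was printed right away and false when
 * the console was busy and the message was only buffered.
 */
static bool toy_print(struct toy_console *con, const char *msg)
{
	if (!toy_acquire(con)) {
		/* Console busy: defer to the current owner. */
		if (con->backlog_len < TOY_BACKLOG_MAX)
			toy_copy(con->backlog[con->backlog_len++], msg);
		return false;
	}

	toy_emit(con, msg);
	toy_release(con);
	return true;
}
```

With an idle console, toy_print() emits the message directly. If another
context holds the console (toy_acquire() succeeded earlier), the message
lands in the backlog and shows up only when that owner calls
toy_release(). This is roughly the shape of the trylock approach, with
all the known downsides of relying on the current owner to flush.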

Just for the record, the idea of "buffering in emergency" came up
in the opposite scenario:

<flood of messages>

CPU 0					CPU 1

WARN()
  printk()
    flush_consoles()
      # handling long backlog

					panic()
					  printk()
					    flush_consoles()
					    # successfully took over the lock
					    # and continued flushing the backlog

Result: CPU 0 never printed the rest of the WARN()

It looked acceptable because the WARN() code was just printing messages,
was well tested and should never fail (famous last words).

Another motivation was that the consoles were handled by separate
threads. That might make it possible to see the entire WARN() on fast
consoles before a serial one prints the first line.

Also there are ways to see the messages without working consoles,
e.g. via crash dump, pstore, persistent memory. The buffer-first
approach might make even more sense in this case.

> But I really do think that we should never buffer "by default". And
> that's why I kind of hate that whole concept of "oops_begin starts
> buffering". It's exactly the kind of "buffer by default" mental model
> that I was really hoping we'd never have.

I agree that buffering in an emergency is riskier than in a normal
situation. The idea needs more love. Let's continue in a more
conservative way for now.

John is going to rework the series and remove the buffering in
emergency. I am going to send another pull request with
just a few trivial fixes for 6.11.

Best Regards,
Petr