Message-ID: <aMllXWKbn_5INlEt@pathway.suse.cz>
Date: Tue, 16 Sep 2025 15:25:49 +0200
From: Petr Mladek <pmladek@...e.com>
To: Breno Leitao <leitao@...ian.org>
Cc: John Ogness <john.ogness@...utronix.de>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mike Galbraith <efault@....de>, linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH printk v1 1/1] printk: nbcon: Allow unsafe write_atomic()
 for panic

On Mon 2025-09-15 08:46:13, Breno Leitao wrote:
> On Mon, Sep 15, 2025 at 04:20:35PM +0206, John Ogness wrote:
> > On 2025-09-15, Breno Leitao <leitao@...ian.org> wrote:
> > > On Fri, Sep 12, 2025 at 02:24:52PM +0206, John Ogness wrote:
> > >> @@ -1606,6 +1610,13 @@ static void __nbcon_atomic_flush_pending(u64 stop_seq, bool allow_unsafe_takeove
> > >>  		if (!console_is_usable(con, flags, true))
> > >>  			continue;
> > >>  
> > >> +		/*
> > >> +		 * It is only allowed to use unsafe ->write_atomic() from
> > >> +		 * nbcon_atomic_flush_unsafe().
> > >> +		 */
> > >> +		if ((flags & CON_NBCON_ATOMIC_UNSAFE) && !allow_unsafe_takeover)
> > >> +			continue;
> > >
> > > What will happen with the "message" in this case? is it lost?
> > >
> > > Let me clarify I understand the patch. The .write_atomic callback are
> > > called in two cases:
> > >
> > > 	1) Inside IRQ/NMI and scheduling context
> > > 	2) During panics.
> > >
> > > In both cases, they go through __nbcon_atomic_flush_pending_con(),
> > > right?
> > 
> > @allow_unsafe_takeover is only true at the very end of panic. In all
> > other cases, the ->write_atomic() callback is ignored as if it wasn't
> > implemented. That means it will rely on the deferred printing kthread to
> > handle it.
> > 
> > > Let's say that netconsole implements CON_NBCON_ATOMIC_UNSAFE. What will
> > > happen with printks() inside IRQs (when the system is NOT panicking).
> > > Are they coming through __nbcon_atomic_flush_pending() and will be
> > > skipped?
> > >
> > > Also, are these messages even deferred for later flush?
> > 
> > When the system is not panicking, CON_NBCON_ATOMIC_UNSAFE has the effect
> > of acting as if you never implemented ->write_atomic(). So yes, only
> > ->write_thread() will handle everything in a deferred context. If the
> > system never panics, your ->write_atomic() will never be called.
> 
> If there is a printk() inside an IRQ and the host is not panicking, then
> the message will be deferred to the kthread, which will print through
> ->write_thread.

Just to be sure that we are all on the same page.

Note that the above statement is true for all NBCON consoles,
including serial consoles, even with the _safe_ .write_atomic().

In fact, printk() does not distinguish between IRQ and task context. The
primary distinction is between the following three priorities:

  + NBCON_PRIO_NORMAL is used when the system is working properly.

    In this case, all messages are deferred to the kthread when
    the kthread is available.


  + NBCON_PRIO_EMERGENCY is currently used in some situations,
    e.g. WARN(), RCU stall, or lockdep report.

    In this case, printk() tries to emit the messages directly using
    _safe_ .write_atomic(). They must be deferred to the kthread when
    the callback is not implemented or is not safe.


  + NBCON_PRIO_PANIC is used in panic() by the panic CPU.

    Note that only the panic CPU is allowed to flush messages. This is
    to be on the safe side when the other CPUs are stopped. In fact,
    non-panic CPUs are not even allowed to add new messages by default.

    Anyway, in panic(), printk() tries to flush the messages directly
    using _safe_ .write_atomic(). They are ignored when the callback
    is not implemented.

    This patch will allow using the _unsafe_ .write_atomic() in
    the final "can't lose anything" nbcon_atomic_flush_unsafe()
    call before the CPU enters the final infinite loop (blinking LEDs).
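
To make sure we mean the same thing, the three cases above can be
modeled roughly as below. This is only a userspace sketch, not kernel
code: the sketch_* names and the struct layout are made up for
illustration, the kthread is assumed to be available, and the
allow_unsafe_takeover parameter stands for "we are in the final
nbcon_atomic_flush_unsafe() call".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three nbcon priorities. */
enum sketch_prio {
	SKETCH_PRIO_NORMAL,
	SKETCH_PRIO_EMERGENCY,
	SKETCH_PRIO_PANIC,
};

/* What happens to a pending message on one console. */
enum sketch_action {
	SKETCH_DEFER_TO_KTHREAD,	/* left for ->write_thread() */
	SKETCH_EMIT_ATOMIC,		/* flushed via ->write_atomic() */
	SKETCH_SKIP,			/* not flushed in this context */
};

/* Hypothetical console description, not the real struct console. */
struct sketch_console {
	bool has_write_atomic;	/* ->write_atomic() is implemented */
	bool atomic_unsafe;	/* models CON_NBCON_ATOMIC_UNSAFE */
};

enum sketch_action sketch_route(const struct sketch_console *con,
				enum sketch_prio prio,
				bool allow_unsafe_takeover)
{
	bool safe_atomic;

	assert(con);
	safe_atomic = con->has_write_atomic && !con->atomic_unsafe;

	switch (prio) {
	case SKETCH_PRIO_NORMAL:
		/* Assumption: the kthread is available. */
		return SKETCH_DEFER_TO_KTHREAD;
	case SKETCH_PRIO_EMERGENCY:
		/* Direct output only with a safe callback. */
		return safe_atomic ? SKETCH_EMIT_ATOMIC
				   : SKETCH_DEFER_TO_KTHREAD;
	case SKETCH_PRIO_PANIC:
		if (safe_atomic)
			return SKETCH_EMIT_ATOMIC;
		/* Unsafe callback: only in the final unsafe flush. */
		if (con->has_write_atomic && allow_unsafe_takeover)
			return SKETCH_EMIT_ATOMIC;
		return SKETCH_SKIP;
	}
	return SKETCH_SKIP;
}
```

With this model, a netconsole-like console that has only an unsafe
callback gets SKETCH_DEFER_TO_KTHREAD for the NORMAL and EMERGENCY
priorities, and SKETCH_EMIT_ATOMIC only for the PANIC priority with
allow_unsafe_takeover set.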


Note that we use the term "priority" because the context with
the higher (more critical) priority is allowed to take over
the ownership of the console even in the middle of emitting
a message.
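
The takeover rule can be illustrated with a deliberately simplified
userspace sketch. The demo_* names are hypothetical and the model
ignores everything else the real acquire path has to care about
(safe/unsafe sections, handover requests). It only shows that
a strictly higher priority wins:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical priorities; DEMO_PRIO_NONE means "console not owned". */
enum demo_prio {
	DEMO_PRIO_NONE = 0,
	DEMO_PRIO_NORMAL,
	DEMO_PRIO_EMERGENCY,
	DEMO_PRIO_PANIC,
};

/* Hypothetical ownership state of one console. */
struct demo_console_state {
	enum demo_prio owner_prio;
};

/*
 * Try to become the owner. Succeeds only when the requester has
 * a strictly higher priority than the current owner, even if the
 * current owner is in the middle of emitting a message.
 */
bool demo_try_take_over(struct demo_console_state *st, enum demo_prio req)
{
	assert(st);

	if (req <= st->owner_prio)
		return false;

	st->owner_prio = req;
	return true;
}
```

For example, a DEMO_PRIO_PANIC requester takes the console from
a DEMO_PRIO_NORMAL owner, while a second DEMO_PRIO_NORMAL requester
does not.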

> So, from a user/netconsole perspective, assuming the no-panic case
> (allow_unsafe_takeover=false) all the messages will be transmitted
> (always from a thread context), even if the printk() happens on an IRQ.
> So, no message will be lost.

True.

Best Regards,
Petr
