linux-kernel - Re: Problem with nbcon console and amba-pl011 serial port

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aEBNLMYVUOGzusuR@pathway.suse.cz>
Date: Wed, 4 Jun 2025 15:42:04 +0200
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: "Toshiyuki Sato (Fujitsu)" <fj6611ie@...itsu.com>,
	'Michael Kelley' <mhklinux@...look.com>,
	'Ryo Takakura' <ryotkkr98@...il.com>,
	Russell King <linux@...linux.org.uk>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-serial@...r.kernel.org" <linux-serial@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: Re: Problem with nbcon console and amba-pl011 serial port

On Wed 2025-06-04 13:56:34, John Ogness wrote:
> On 2025-06-04, Petr Mladek <pmladek@...e.com> wrote:
> > On Wed 2025-06-04 04:11:10, Toshiyuki Sato (Fujitsu) wrote:
> >> > On 2025-06-03, John Ogness <john.ogness@...utronix.de> wrote:
> >> > > On 2025-06-03, "Toshiyuki Sato (Fujitsu)" <fj6611ie@...itsu.com> wrote:
> >> > >>> 4. pr_emerg() has a high logging level, and it effectively steals the console
> >> > >>> from the "pr/ttyAMA0" task, which I believe is intentional in the nbcon
> >> > design.
> >> > >>> Down in pl011_console_write_thread(), the "pr/ttyAMA0" task is doing
> >> > >>> nbcon_enter_unsafe() and nbcon_exit_unsafe() around each character
> >> > >>> that it outputs.  When pr_emerg() steals the console, nbcon_exit_unsafe()
> >> > >>> returns 0, so the "for" loop exits. pl011_console_write_thread() then
> >> > >>> enters a busy "while" loop waiting to reclaim the console. It's doing this
> >> > >>> busy "while" loop with interrupts disabled, and because of the panic,
> >> > >>> it never succeeds.
> >
> > I am a bit surprised that it never succeeds. The panic CPU takes over
> > the ownership but it releases it when the messages are flushed. And
> > the original owner should be able to reacquire it in this case.
> 
> The problem is that other_cpu_in_panic() will return true forever, which
> will cause _all_ acquires to fail forever. Originally we did allow
> non-panic to take over again after panic releases ownership. But IIRC we
> removed that capability because it allowed us to reduce a lot of
> complexity. And now nbcon_waiter_matches() relies on "Lower priorities
> are ignored during panic() until reboot."

Great catch! I forgot it. And it explains everything.

It would be nice to mention this in the commit message or
in the comment above nbcon_reacquire_nobuf().

My updated prosal of the comment is:

 * Return:	True when the context reacquired the owner ship. The caller
 *		might try entering the unsafe state and restore the original
 *		console device setting. It must not access the output buffer
 *		anymore.
 *
 *		False when another CPU is in panic(). nbcon_try_acquire()
 *		would never succeed and the infinite loop would	prevent
 *		stopping this CPU on architectures without proper NMI.
 *		The caller should bail out immediately without
 *		touching the console device or the output buffer.

Best Regards,
Petr