linux-kernel - Re: [RFC 0/1] serial: 8250: nbcon_atomic_flush_pending() might trigger watchdog warnigns


Message-ID: <84tt0qeqqk.fsf@jogness.linutronix.de>
Date: Thu, 25 Sep 2025 13:12:11 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Jiri Slaby
 <jirislaby@...nel.org>, Sergey Senozhatsky <senozhatsky@...omium.org>,
 Steven Rostedt <rostedt@...dmis.org>, Thomas Gleixner
 <tglx@...utronix.de>, Esben Haabendal <esben@...nix.com>,
 linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org, Andy
 Shevchenko <andriy.shevchenko@...ux.intel.com>, Arnd Bergmann
 <arnd@...db.de>, Tony Lindgren <tony@...mide.com>, Niklas Schnelle
 <schnelle@...ux.ibm.com>, Serge Semin <fancer.lancer@...il.com>, Andrew
 Murray <amurray@...goodpenguin.co.uk>
Subject: Re: [RFC 0/1] serial: 8250: nbcon_atomic_flush_pending() might
 trigger watchdog warnigns

Hi Petr,

Thanks for putting together this summary...

On 2025-09-24, Petr Mladek <pmladek@...e.com> wrote:
> We currently have the following solutions for the original
> problem (hardlockup in nbcon_reacquire_nobuf()):
>
>
> 1. Touch the watchdog in nbcon_reacquire_nobuf()
>
>    Pros:
> 	+ trivial
>
>    Cons:
> 	+ Two CPUs might be blocked by slow serial consoles.

Note that nbcon_reacquire_nobuf() is not the only function that behaves
this way. We also have the same thing in the port lock wrapper:
__uart_port_nbcon_acquire().

> 2. Yield nbcon console context ownership between each record
>    and block all kthreads from emergency_enter/exit API
>
>    Pros:
> 	+ Only one CPU is blocked by slow serial console
> 	+ Prevents repeated takeovers for "every" new message
>
>    Cons:
> 	+ More complex than 1
> 	+ Completely give up on parallel console handling in emergency

This seems like the most practical solution for now. It is simple and
will guarantee that no "kthread interference" occurs. Note that only
consoles that implement write_atomic() need to have their kthread
blocked. (Also consoles with unsafe write_atomic() would not need to
have their kthread blocked.)
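
To make that rule concrete, here is a rough user-space sketch. The
struct and function names are made up for illustration; this is a
stand-in model, not the kernel's struct console or any existing helper.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the relevant bits of a console.
 * This is NOT the kernel's struct console.
 */
struct model_console {
	/* NULL if the console has no atomic write callback */
	void (*write_atomic)(const char *msg);
	/* true if write_atomic() exists but is considered unsafe */
	bool atomic_unsafe;
};

/*
 * Decide whether the printing kthread of @con would have to be
 * blocked while an emergency section is active, following the
 * rule described above.
 */
static bool kthread_needs_blocking(const struct model_console *con)
{
	/* No write_atomic(): nothing for the kthread to interfere with. */
	if (!con->write_atomic)
		return false;

	/* Unsafe write_atomic(): per the note above, no blocking needed. */
	if (con->atomic_unsafe)
		return false;

	return true;
}

/* Dummy callback so the model can describe an atomic-capable console. */
static void dummy_write_atomic(const char *msg)
{
	(void)msg;
}
```

With this model, only a console that has a write_atomic() callback and
is not flagged as unsafe would have its kthread held back.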

> 3. Yield nbcon console context ownership between each record
>    and block only one kthread from __nbcon_atomic_flush_pending_con()
>
>    Pros:
> 	+ Only one CPU is blocked by slow serial console
> 	+ Parallel console handling still possible in emergency
>
>    Cons:
> 	+ More complex than 1   (similar to 2)
> 	+ Possible repeated takeovers for "every" new emergency message

IMHO this is the most complex solution and will still not guarantee
avoiding kthread interference.

> Well, releasing the console context ownership after each record
> might solve also some other problems

Sure, like with the port lock wrapper.

> I am going to try implementing the 3rd solution and see how
> complicated it would be.
>
> It would be possible to change it to the 2nd easily just by
> using a global counter and updating it in emergency_enter/exit API.

Basically you are talking about changing the per-CPU emergency counter
to be global.
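
As a minimal user-space sketch of what a global counter could look
like (all names here are hypothetical and C11 atomics stand in for the
kernel's atomic_t; this is not the existing implementation):

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical global replacement for the per-CPU emergency
 * nesting counter. Zero means no CPU is in an emergency section.
 */
static atomic_int emergency_global_count;

/* Model of entering an emergency section on any CPU. */
static void model_emergency_enter(void)
{
	atomic_fetch_add(&emergency_global_count, 1);
}

/* Model of leaving an emergency section on any CPU. */
static void model_emergency_exit(void)
{
	atomic_fetch_sub(&emergency_global_count, 1);
}

/*
 * What a printing kthread would check before trying to print:
 * back off while any CPU is still inside an emergency section.
 */
static bool model_kthread_should_wait(void)
{
	return atomic_load(&emergency_global_count) > 0;
}
```

Because the counter is shared, nested or concurrent emergency sections
on different CPUs all keep the kthreads waiting until the last one
exits.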

John