Message-ID: <84tt0qeqqk.fsf@jogness.linutronix.de>
Date: Thu, 25 Sep 2025 13:12:11 +0206
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Jiri Slaby
<jirislaby@...nel.org>, Sergey Senozhatsky <senozhatsky@...omium.org>,
Steven Rostedt <rostedt@...dmis.org>, Thomas Gleixner
<tglx@...utronix.de>, Esben Haabendal <esben@...nix.com>,
linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org, Andy
Shevchenko <andriy.shevchenko@...ux.intel.com>, Arnd Bergmann
<arnd@...db.de>, Tony Lindgren <tony@...mide.com>, Niklas Schnelle
<schnelle@...ux.ibm.com>, Serge Semin <fancer.lancer@...il.com>, Andrew
Murray <amurray@...goodpenguin.co.uk>
Subject: Re: [RFC 0/1] serial: 8250: nbcon_atomic_flush_pending() might
trigger watchdog warnings
Hi Petr,
Thanks for putting together this summary...
On 2025-09-24, Petr Mladek <pmladek@...e.com> wrote:
> We currently have the following solutions for the original
> problem (hardlockup in nbcon_reacquire_nobuf()):
>
>
> 1. Touch the watchdog in nbcon_reacquire_nobuf()
>
> Pros:
> + trivial
>
> Cons:
> + Two CPUs might be blocked by slow serial consoles.
Note that nbcon_reacquire_nobuf() is not the only function that behaves
this way. We also have the same thing in the port lock wrapper:
__uart_port_nbcon_acquire().
> 2. Yield nbcon console context ownership between each record
> and block all kthreads from emergency_enter/exit API
>
> Pros:
> + Only one CPU is blocked by slow serial console
> + Prevents repeated takeovers for "every" new message
>
> Cons:
> + More complex than 1
> + Completely give up on parallel console handling in emergency
This seems like the most practical solution for now. It is simple and
will guarantee that no "kthread interference" occurs. Note that only
consoles that implement write_atomic() need to have their kthread
blocked. (Also consoles with unsafe write_atomic() would not need to
have their kthread blocked.)
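As a rough illustration of that condition, here is a minimal user-space
sketch. It is not kernel code: struct model_console, its two fields, and
model_kthread_needs_blocking() are hypothetical names invented for this
example, and how the real code would detect an unsafe write_atomic() is
left open.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the relevant console properties. These
 * fields do not mirror the real struct console layout.
 */
struct model_console {
	bool has_write_atomic;     /* console implements write_atomic() */
	bool write_atomic_unsafe;  /* its write_atomic() is marked unsafe */
};

/*
 * Assumption: only a console with a safe write_atomic() can be flushed
 * from the emergency context, so only its kthread could interfere and
 * needs to be blocked.
 */
static bool model_kthread_needs_blocking(const struct model_console *con)
{
	return con->has_write_atomic && !con->write_atomic_unsafe;
}
```

With this predicate, a console without write_atomic(), or with an unsafe
one, would keep its kthread running during an emergency section.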
> 3. Yield nbcon console context ownership between each record
> and block only one kthread from __nbcon_atomic_flush_pending_con()
>
> Pros:
> + Only one CPU is blocked by slow serial console
> + Parallel console handling still possible in emergency
>
> Cons:
> + More complex than 1 (similar to 2)
> + Possible repeated takeovers for "every" new emergency message
IMHO this is the most complex solution and will still not guarantee
avoiding kthread interference.
> Well, releasing the console context ownership after each record
> might solve also some other problems
Sure, like with the port lock wrapper.
> I am going to try implementing the 3rd solution and see how
> complicated it would be.
>
> It would be possible to change it to the 2nd easily just by
> using a global counter and updating it in emergency_enter/exit API.
Basically you are talking about changing the per-CPU emergency counter
to be global.
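A minimal user-space sketch of what a global counter could look like,
using C11 atomics. This is only a model under stated assumptions: the
names model_emergency_count, model_emergency_enter(),
model_emergency_exit() and model_kthread_may_print() are hypothetical
and do not correspond to existing kernel symbols.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical global nesting counter. It is non-zero while any CPU is
 * inside an emergency section (static storage, so it starts at zero).
 */
static atomic_uint model_emergency_count;

/* Called when a CPU enters an emergency section (may nest). */
static void model_emergency_enter(void)
{
	atomic_fetch_add(&model_emergency_count, 1);
}

/* Called when a CPU leaves an emergency section. */
static void model_emergency_exit(void)
{
	unsigned int old = atomic_fetch_sub(&model_emergency_count, 1);

	/* An exit without a matching enter would be a bug. */
	assert(old > 0);
}

/*
 * A printer kthread would check this before trying to take console
 * ownership and back off while any emergency section is active.
 */
static bool model_kthread_may_print(void)
{
	return atomic_load(&model_emergency_count) == 0;
}
```

Because the counter is shared, an emergency section on one CPU would
hold back the kthreads on all CPUs, which is the trade-off listed for
the 2nd solution.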
John