Message-ID: <aQNM3r6YU_4fl2Xx@pathway.suse.cz>
Date: Thu, 30 Oct 2025 12:32:46 +0100
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jiri Slaby <jirislaby@...nel.org>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Esben Haabendal <esben@...nix.com>, linux-serial@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
	Arnd Bergmann <arnd@...db.de>, Tony Lindgren <tony@...mide.com>,
	Niklas Schnelle <schnelle@...ux.ibm.com>,
	Serge Semin <fancer.lancer@...il.com>,
	Andrew Murray <amurray@...goodpenguin.co.uk>
Subject: Re: [PATCH 0/3] printk/nbcon: Prevent hardlockup reports caused by
 atomic nbcon flush
On Fri 2025-09-26 14:49:09, Petr Mladek wrote:
> This patchset should solve problem which was being discussed
> at https://lore.kernel.org/all/aNFR45fL2L4PavNc@pathway.suse.cz
> 
> __nbcon_atomic_flush_pending_con() preserves the nbcon console
> ownership all the time when flushing pending messages. It might
> take a long time with slow serial consoles.
> 
> It might trigger a hardlockup report on another CPU which is
> busy waiting for the nbcon console ownership, for example,
> in nbcon_reacquire_nobuf() or __uart_port_nbcon_acquire().
> 
> The problem is solved by the 3rd patch. It releases the console
> context ownership after each record.
> 
> The 3rd patch alone would increase the risk of takeovers and repeated
> lines. It is prevented by the 1st patch which blocks the printk kthread
> when any CPU is in an emergency context.
> 
> The 2nd patch allows to block the printk kthread also in panic.
> It is not important. It is just an obvious update of the check
> for emergency contexts.
> 
> Note: The patchset applies against current Linus' tree (v6.17-rc7).
> 
>       The 2nd patch would need an update after the consolidation of
>       the panic state API gets merged via -mm tree,
>       see https://lore.kernel.org/r/20250825022947.1596226-2-wangjinchao600@gmail.com
> 
> Petr Mladek (3):
>   printk/nbcon: Block printk kthreads when any CPU is in an emergency
>     context
>   printk/nbcon/panic: Allow printk kthread to sleep when the system is
>     in panic
>   printk/nbcon: Release nbcon consoles ownership in atomic flush after
>     each emitted record
> 
>  kernel/printk/internal.h |  1 +
>  kernel/printk/nbcon.c    | 43 +++++++++++++++++++++++++++++++++++-----
>  kernel/printk/printk.c   |  2 +-
>  3 files changed, 40 insertions(+), 6 deletions(-)
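For readers who want a concrete picture of what "releases the console
context ownership after each record" means, here is a minimal userspace
sketch. It is not the kernel code: struct model_console and the
model_*() helpers are hypothetical names invented for illustration, and
the real __nbcon_atomic_flush_pending_con() handles priorities, unsafe
takeovers and much more. The only point is that ownership is re-acquired
for every record, so a CPU busy-waiting for the console gets a chance
between records.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_console {
	atomic_int owner;	/* -1 when unowned, otherwise the owning CPU id */
	int next_seq;		/* sequence number of the next record to emit */
	int acquisitions;	/* how many times ownership has been taken */
};

static void model_console_init(struct model_console *con)
{
	atomic_init(&con->owner, -1);
	con->next_seq = 0;
	con->acquisitions = 0;
}

/* Try to take ownership for @cpu; fails if any CPU already owns it. */
static bool model_try_acquire(struct model_console *con, int cpu)
{
	int expected = -1;

	if (!atomic_compare_exchange_strong(&con->owner, &expected, cpu))
		return false;
	con->acquisitions++;
	return true;
}

static void model_release(struct model_console *con)
{
	atomic_store(&con->owner, -1);
}

/*
 * Emit records up to (not including) @stop_seq. Ownership is dropped
 * after every record instead of being held for the whole flush.
 * Returns the number of records emitted by this call.
 */
static int model_flush_pending(struct model_console *con, int cpu, int stop_seq)
{
	int emitted = 0;

	while (con->next_seq < stop_seq) {
		if (!model_try_acquire(con, cpu))
			break;		/* somebody else owns the console */
		con->next_seq++;	/* stands in for writing one record */
		emitted++;
		model_release(con);	/* window for waiting CPUs */
	}
	return emitted;
}
```

In this model, flushing three records takes ownership three times, and
a flush attempted while another CPU owns the console emits nothing and
returns immediately.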
JFYI, the patchset has been committed into printk/linux.git,
branch rework/atomic-flush-hardlockup[1].
It is queued for 6.19.
Note that I did the following modifications:
  + Added the changes to the 1st patch proposed by John [2], namely:
     + initialize nbcon_cpu_emergency_cnt and make it static.
     + call nbcon_kthreads_wake() only when printk_get_console_flush_type()
       sets ft.nbcon_offload.
  + Rebased the 2nd patch on top of 6.18-rc1 (panic_in_progress() moved to
    linux/panic.h).
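To illustrate the shape of the changes to the 1st patch, here is a
rough userspace sketch. It is only a model under stated assumptions:
model_emergency_cnt stands in for nbcon_cpu_emergency_cnt, struct
model_flush_type stands in for the result of
printk_get_console_flush_type(), and a plain counter replaces
nbcon_kthreads_wake(). All model_* names are hypothetical, and the
assumption that the wakeup is attempted when the last CPU leaves the
emergency context is mine, not taken from the committed code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for nbcon_cpu_emergency_cnt: static and explicitly initialized. */
static atomic_int model_emergency_cnt = 0;

/* Counts simulated wakeups; replaces nbcon_kthreads_wake() in this model. */
static int model_kthread_wakeups;

/* Hypothetical reduction of the flush type to the one field used here. */
struct model_flush_type {
	bool nbcon_offload;
};

static void model_emergency_enter(void)
{
	atomic_fetch_add(&model_emergency_cnt, 1);
}

/*
 * Leave the emergency context. atomic_fetch_sub() returns the previous
 * value, so 1 means this was the last CPU in an emergency context.
 * The kthread is woken only when the flush type asks for offloading.
 */
static void model_emergency_exit(const struct model_flush_type *ft)
{
	if (atomic_fetch_sub(&model_emergency_cnt, 1) == 1 && ft->nbcon_offload)
		model_kthread_wakeups++;
}

/* The printk kthread stays blocked while any CPU is in an emergency. */
static bool model_kthread_should_print(void)
{
	return atomic_load(&model_emergency_cnt) == 0;
}
```

With two CPUs in an emergency context, the modeled kthread stays blocked
until both have left, and no wakeup is counted when nbcon_offload is
not set.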
[1] https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/log/?h=rework/atomic-flush-hardlockup
[2] https://lore.kernel.org/all/841pnti8k2.fsf@jogness.linutronix.de/
Best Regards,
Petr
PS: I thought about sending v2. But v1 already got enough Acks and
    I added the requested changes by cut&paste.