linux-kernel - Re: [PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record

Message-ID: <CALqELGzfY2L0z231Zt94-iuy5jE-+Lzjm2TqMgHrE3jsx-DCBQ@mail.gmail.com>
Date: Mon, 29 Sep 2025 09:38:57 +0100
From: Andrew Murray <amurray@...goodpenguin.co.uk>
To: Petr Mladek <pmladek@...e.com>
Cc: John Ogness <john.ogness@...utronix.de>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Jiri Slaby <jirislaby@...nel.org>, 
	Sergey Senozhatsky <senozhatsky@...omium.org>, Steven Rostedt <rostedt@...dmis.org>, 
	Thomas Gleixner <tglx@...utronix.de>, Esben Haabendal <esben@...nix.com>, linux-serial@...r.kernel.org, 
	linux-kernel@...r.kernel.org, 
	Andy Shevchenko <andriy.shevchenko@...ux.intel.com>, Arnd Bergmann <arnd@...db.de>, 
	Tony Lindgren <tony@...mide.com>, Niklas Schnelle <schnelle@...ux.ibm.com>, 
	Serge Semin <fancer.lancer@...il.com>
Subject: Re: [PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in
 atomic flush after each emitted record

On Fri, 26 Sept 2025 at 13:50, Petr Mladek <pmladek@...e.com> wrote:
>
> printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
> nbcon consoles immediately. It might take seconds to flush all
> pending lines on slow serial consoles. Note that there might be
> hundreds of messages, for example:
>
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:324
> ** replaying previous printk message **
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
> [ ... more than 2000 lines, about 200kB messages ... ]
> [    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
> [    3.837851][    T
> ** replaying previous printk message **
> [    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
> [    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
> [ ... more than 500 messages for iommu groups 21-590 ...]
> [    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
> [    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
> [    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
> [    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
> [    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
> [    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
> [    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
> [    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
> [    3.933486][    T1] ------------[ cut here ]------------
> [    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
> [    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
> [    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
> [...]
> [    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
> [...]
> [    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
> [    3.930291][    C0] Call Trace:
> [...]
> [    3.930291][    C0]  <TASK>
> [    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
> [    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
> [    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
> [    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0
>
> There are two visible takeovers of the console ownership:
>
>   - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
>     arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.
>
>   - The 2nd one is triggered by the "Kernel panic - not syncing:
>     Hard LOCKUP" line printed with NBCON_PRIO_PANIC.
>
> There are more than 2500 lines, at about 240kB, emitted between
> the takeover and the 1st "WARNING" line in the emergency context.
> This amount of pending messages had to be flushed by
> nbcon_atomic_flush_pending() when WARN() printed its first line.
>
> The atomic flush held the nbcon console context for so long that it
> triggered a hard lockup on the CPU running the printk kthread
> "pr/ttyS0". The kthread needed to reacquire the console ownership
> to restore the original serial port state in serial8250_console_write().
>
> Prevent the hard lockup by releasing the nbcon console ownership after
> each emitted record.
>
> Note that __nbcon_atomic_flush_pending_con() used to hold the console
> ownership all the time because it blocked the printk kthread. Otherwise
> the kthread tried to flush the messages in parallel which caused repeated
> takeovers and more replayed messages.
>
> It is no longer a problem because the repeated takeovers are blocked
> by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.
>
> Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
> Signed-off-by: Petr Mladek <pmladek@...e.com>
> ---
>  kernel/printk/nbcon.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 219ae0c8b5ed..e298346111b2 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1532,10 +1532,10 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>         ctxt->prio                      = nbcon_get_default_prio();
>         ctxt->allow_unsafe_takeover     = allow_unsafe_takeover;
>
> -       if (!nbcon_context_try_acquire(ctxt, false))
> -               return -EPERM;
> -
>         while (nbcon_seq_read(con) < stop_seq) {
> +               if (!nbcon_context_try_acquire(ctxt, false))
> +                       return -EPERM;
> +
>                 /*
>                  * nbcon_emit_next_record() returns false when the console was
>                  * handed over or taken over. In both cases the context is no
> @@ -1544,6 +1544,8 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>                 if (!nbcon_emit_next_record(&wctxt, true))
>                         return -EAGAIN;
>
> +               nbcon_context_release(ctxt);
> +
>                 if (!ctxt->backlog) {
>                         /* Are there reserved but not yet finalized records? */
>                         if (nbcon_seq_read(con) < stop_seq)
> @@ -1552,7 +1554,6 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>                 }
>         }
>
> -       nbcon_context_release(ctxt);
>         return err;
>  }
>
> --
> 2.51.0
>
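
For anyone following along, the effect of moving the acquire/release
inside the loop can be illustrated with a small user-space sketch. This
is not kernel code: struct mock_console and the mock_*() and flush_*()
helpers are hypothetical stand-ins invented for illustration, and the
sketch only counts how many records are emitted under a single
ownership period. It does not model takeovers, priorities, or the
printk kthread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for console state; not the kernel's struct console. */
struct mock_console {
	bool owned;		/* true while a context owns the console */
	unsigned long seq;	/* sequence number of the next record to emit */
	unsigned int cur_hold;	/* records emitted in the current ownership period */
	unsigned int max_hold;	/* longest ownership period seen, in records */
};

/* Take ownership if nobody holds it; loosely mirrors a try-acquire. */
bool mock_try_acquire(struct mock_console *con)
{
	if (con->owned)
		return false;
	con->owned = true;
	con->cur_hold = 0;
	return true;
}

void mock_release(struct mock_console *con)
{
	con->owned = false;
}

/* "Emit" one record; only valid while the console is owned. */
void mock_emit(struct mock_console *con)
{
	assert(con->owned);
	con->seq++;
	con->cur_hold++;
	if (con->cur_hold > con->max_hold)
		con->max_hold = con->cur_hold;
}

/* Shape of the loop before the patch: one acquire for the whole backlog. */
int flush_hold_all(struct mock_console *con, unsigned long stop_seq)
{
	if (!mock_try_acquire(con))
		return -EPERM;
	while (con->seq < stop_seq)
		mock_emit(con);
	mock_release(con);
	return 0;
}

/* Shape of the loop after the patch: acquire and release per record. */
int flush_per_record(struct mock_console *con, unsigned long stop_seq)
{
	while (con->seq < stop_seq) {
		if (!mock_try_acquire(con))
			return -EPERM;
		mock_emit(con);
		mock_release(con);
	}
	return 0;
}
```

Flushing a 2500-record backlog with flush_hold_all() keeps ownership
for all 2500 records in one go, while flush_per_record() never holds it
for more than one record, which gives another context a chance to
acquire the console between records.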

Reviewed-by: Andrew Murray <amurray@...goodpenguin.co.uk>

Thanks,

Andrew Murray