Message-ID: <aS7ApzJG8RPhtKh5@pathway.suse.cz>
Date: Tue, 2 Dec 2025 11:34:15 +0100
From: Petr Mladek <pmladek@...e.com>
To: John Ogness <john.ogness@...utronix.de>
Cc: Breno Leitao <leitao@...ian.org>, linux@...linux.org.uk,
	paulmck@...nel.org, usamaarif642@...il.com, leo.yan@....com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	kernel-team@...a.com, rmikey@...a.com
Subject: Re: CSD lockup during kexec due to unbounded busy-wait in
 pl011_console_write_atomic (arm64)

On Mon 2025-12-01 14:27:32, John Ogness wrote:
> On 2025-12-01, John Ogness <john.ogness@...utronix.de> wrote:
> >> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> >> index 3fa403f9831f..6b8becb6ecd9 100644
> >> --- a/kernel/printk/nbcon.c
> >> +++ b/kernel/printk/nbcon.c
> >> @@ -1549,6 +1549,7 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq)
> >>  {
> >>  	struct nbcon_write_context wctxt = { };
> >>  	struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt);
> >> +	unsigned long flags;
> >>  	int err = 0;
> >>  
> >>  	ctxt->console			= con;
> >> @@ -1557,18 +1558,31 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq)
> >>  	ctxt->allow_unsafe_takeover	= nbcon_allow_unsafe_takeover();
> >>  
> >>  	while (nbcon_seq_read(con) < stop_seq) {
> >> -		if (!nbcon_context_try_acquire(ctxt, false))
> >> +		/*
> >> +		 * Atomic flushing does not use console driver synchronization
> >> +		 * (i.e. it does not hold the port lock for uart consoles).
> >> +		 * Therefore IRQs must be disabled to avoid being interrupted
> >> +		 * and then calling into a driver that will deadlock trying
> >> +		 * to acquire console ownership.
> >> +		 */
> >> +		local_irq_save(flags);
> >> +		if (!nbcon_context_try_acquire(ctxt, false)) {
> >> +			local_irq_restore(flags);
> >>  			return -EPERM;
> >> +		}
> >>  
> >>  		/*
> >>  		 * nbcon_emit_next_record() returns false when the console was
> >>  		 * handed over or taken over. In both cases the context is no
> >>  		 * longer valid.
> >>  		 */
> >> -		if (!nbcon_emit_next_record(&wctxt, true))
> >> +		if (!nbcon_emit_next_record(&wctxt, true)) {
> >> +			local_irq_restore(flags);
> >>  			return -EAGAIN;
> >> +		}
> >>  
> >>  		nbcon_context_release(ctxt);
> >> +		local_irq_restore(flags);
> >
> > Using local_irq_save()/_restore() here is not acceptable for PREEMPT_RT
> > because __nbcon_atomic_flush_pending_con() is also used by
> > nbcon_device_release().

Great catch! I did not think about this code path.

> After thinking about this more, this would be acceptable. If
> printk_get_console_flush_type() is reporting nbcon_atomic==true, then
> the system is in a state where latencies are irrelevant.

I agree. It might be possible to create a special variant for
the nbcon_device_release() code path. But it probably is not
worth it.
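
The exit-path discipline in the quoted diff (restore IRQs on every
return, including the -EPERM and -EAGAIN paths) can be modeled in
userspace. This is only a minimal sketch, not kernel code: every
model_* name, the boolean "flags", and the failure-injection fields
are hypothetical stand-ins for local_irq_save()/local_irq_restore(),
nbcon_context_try_acquire() and nbcon_emit_next_record().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of this is the kernel implementation. */
#define MODEL_EPERM	1
#define MODEL_EAGAIN	11

/* Stand-in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* Models local_irq_save(): remember the old state, then disable. */
static void model_irq_save(bool *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
}

/* Models local_irq_restore(): put back the remembered state. */
static void model_irq_restore(bool flags)
{
	irqs_enabled = flags;
}

enum model_fail { FAIL_NONE, FAIL_ACQUIRE, FAIL_EMIT };

struct model_console {
	unsigned long seq;	/* next record to emit */
	enum model_fail fail;	/* which step to fail, if any */
	unsigned long fail_at;	/* sequence number at which it fails */
};

/* Models nbcon_context_try_acquire(): fails only when told to. */
static bool model_try_acquire(struct model_console *con)
{
	return !(con->fail == FAIL_ACQUIRE && con->seq == con->fail_at);
}

/* Models nbcon_emit_next_record(): must run with IRQs disabled. */
static bool model_emit_next_record(struct model_console *con)
{
	assert(!irqs_enabled);
	if (con->fail == FAIL_EMIT && con->seq == con->fail_at)
		return false;
	con->seq++;
	return true;
}

/* Same control flow as the patched loop: restore on every exit path. */
static int model_flush(struct model_console *con, unsigned long stop_seq)
{
	bool flags;

	while (con->seq < stop_seq) {
		model_irq_save(&flags);
		if (!model_try_acquire(con)) {
			model_irq_restore(flags);
			return -MODEL_EPERM;
		}
		if (!model_emit_next_record(con)) {
			model_irq_restore(flags);
			return -MODEL_EAGAIN;
		}
		/* The context release would happen here, before the restore. */
		model_irq_restore(flags);
	}
	return 0;
}
```

Calling model_flush() with each failure mode in turn and checking
irqs_enabled afterwards shows that no return path leaves interrupts
disabled. It says nothing about the PREEMPT_RT latency question,
which the model does not attempt to capture.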

I am going to mention this in the commit message and send
it as a proper patch.

Thanks a lot for the review and feedback.

Best Regards,
Petr
