Message-ID: <87bjkificj.fsf@jogness.linutronix.de>
Date: Mon, 01 Dec 2025 14:27:32 +0106
From: John Ogness <john.ogness@...utronix.de>
To: Petr Mladek <pmladek@...e.com>, Breno Leitao <leitao@...ian.org>
Cc: linux@...linux.org.uk, paulmck@...nel.org, usamaarif642@...il.com,
leo.yan@....com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com, rmikey@...a.com
Subject: Re: CSD lockup during kexec due to unbounded busy-wait in
pl011_console_write_atomic (arm64)
On 2025-12-01, John Ogness <john.ogness@...utronix.de> wrote:
>> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
>> index 3fa403f9831f..6b8becb6ecd9 100644
>> --- a/kernel/printk/nbcon.c
>> +++ b/kernel/printk/nbcon.c
>> @@ -1549,6 +1549,7 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq)
>>  {
>>  	struct nbcon_write_context wctxt = { };
>>  	struct nbcon_context *ctxt = &ACCESS_PRIVATE(&wctxt, ctxt);
>> +	unsigned long flags;
>>  	int err = 0;
>>
>>  	ctxt->console = con;
>> @@ -1557,18 +1558,31 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq)
>>  	ctxt->allow_unsafe_takeover = nbcon_allow_unsafe_takeover();
>>
>>  	while (nbcon_seq_read(con) < stop_seq) {
>> -		if (!nbcon_context_try_acquire(ctxt, false))
>> +		/*
>> +		 * Atomic flushing does not use console driver synchronization
>> +		 * (i.e. it does not hold the port lock for uart consoles).
>> +		 * Therefore IRQs must be disabled to avoid being interrupted
>> +		 * and then calling into a driver that will deadlock trying
>> +		 * to acquire console ownership.
>> +		 */
>> +		local_irq_save(flags);
>> +		if (!nbcon_context_try_acquire(ctxt, false)) {
>> +			local_irq_restore(flags);
>>  			return -EPERM;
>> +		}
>>
>>  		/*
>>  		 * nbcon_emit_next_record() returns false when the console was
>>  		 * handed over or taken over. In both cases the context is no
>>  		 * longer valid.
>>  		 */
>> -		if (!nbcon_emit_next_record(&wctxt, true))
>> +		if (!nbcon_emit_next_record(&wctxt, true)) {
>> +			local_irq_restore(flags);
>>  			return -EAGAIN;
>> +		}
>>
>>  		nbcon_context_release(ctxt);
>> +		local_irq_restore(flags);
>
> Using local_irq_save()/_restore() here is not acceptable for PREEMPT_RT
> because __nbcon_atomic_flush_pending_con() is also used by
> nbcon_device_release().

After thinking about this more, using local_irq_save()/_restore() here
would be acceptable after all. If printk_get_console_flush_type() reports
nbcon_atomic==true, then the system is in a state where latencies are
irrelevant.

John Ogness
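
The control flow of the quoted hunk can be sketched as a small userspace
model. This is only an illustration, not kernel code: every model_* name
below is a hypothetical stand-in (for local_irq_save()/local_irq_restore(),
nbcon_context_try_acquire() and nbcon_emit_next_record()), and the
"IRQ state" is just a boolean. It shows the one property the patch relies
on: each exit path from the loop restores the saved IRQ state.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's local IRQ-enable state. */
static bool model_irqs_enabled = true;

/* Models local_irq_save(): remember the old state, then disable. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

	model_irqs_enabled = false;
	return flags;
}

/* Models local_irq_restore(): put back whatever was saved. */
static void model_irq_restore(unsigned long flags)
{
	model_irqs_enabled = (flags != 0);
}

/* Hypothetical console model; not the kernel's struct console. */
struct model_console {
	unsigned long seq;	/* next record to emit */
	bool can_acquire;	/* models try_acquire succeeding */
	bool lose_ownership;	/* models a handover/takeover during emit */
};

/* Mirrors the shape of the patched loop in the quoted hunk. */
static int model_atomic_flush(struct model_console *con,
			      unsigned long stop_seq)
{
	unsigned long flags;

	while (con->seq < stop_seq) {
		flags = model_irq_save();

		if (!con->can_acquire) {
			model_irq_restore(flags);
			return -EPERM;
		}

		if (con->lose_ownership) {
			/* Context is no longer valid; only restore IRQs. */
			model_irq_restore(flags);
			return -EAGAIN;
		}

		con->seq++;			/* record emitted */
		model_irq_restore(flags);	/* after the release step */
	}

	return 0;
}
```

In this model, IRQs are disabled for one record at a time and re-enabled
between records. The disabled window is therefore bounded by a single
emit, which is the latency cost the PREEMPT_RT objection is about.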