Message-ID: <84tt4wndu4.fsf@jogness.linutronix.de>
Date: Wed, 04 Jun 2025 09:58:19 +0206
From: John Ogness <john.ogness@...utronix.de>
To: "Toshiyuki Sato (Fujitsu)" <fj6611ie@...itsu.com>, 'Michael Kelley'
<mhklinux@...look.com>
Cc: "pmladek@...e.com" <pmladek@...e.com>, 'Ryo Takakura'
<ryotkkr98@...il.com>, Russell King <linux@...linux.org.uk>, Greg
Kroah-Hartman <gregkh@...uxfoundation.org>, Jiri Slaby
<jirislaby@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-serial@...r.kernel.org"
<linux-serial@...r.kernel.org>, "linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "Toshiyuki Sato (Fujitsu)"
<fj6611ie@...itsu.com>
Subject: RE: Problem with nbcon console and amba-pl011 serial port
On 2025-06-04, "Toshiyuki Sato (Fujitsu)" <fj6611ie@...itsu.com> wrote:
> This is a proposed fix to force termination by returning false from
> nbcon_reacquire_nobuf when a panic occurs within pl011_console_write_thread.
> (I believe this is similar to what John suggested in his previous reply.)
>
> While I couldn't reproduce the issue using sysrq-trigger in my environment
> (It seemed that the panic was being executed before the thread processing),
> I did observe nbcon_reacquire_nobuf failing to complete when injecting an
> NMI (SError) during pl011_console_write_thread.
> Applying this fix seems to have resolved the "SMP: failed to stop secondary
> CPUs" issue.
>
> This patch is for test.
> Modifications to imx and other drivers, as well as adding __must_check,
> will likely be required.
>
> Michael, could you please test this fix in your environment?
>
> Regards,
> Toshiyuki Sato
>
> diff --git a/drivers/tty/serial/amba-pl011.c b/drivers/tty/serial/amba-pl011.c
> index 11d650975..c3a2b22e6 100644
> --- a/drivers/tty/serial/amba-pl011.c
> +++ b/drivers/tty/serial/amba-pl011.c
> @@ -2577,8 +2577,10 @@ pl011_console_write_thread(struct console *co, struct nbcon_write_context *wctxt
> }
> }
>
> -	while (!nbcon_enter_unsafe(wctxt))
> -		nbcon_reacquire_nobuf(wctxt);
> +	while (!nbcon_enter_unsafe(wctxt)) {

I realize this is just a test patch. But the real patch needs a comment
here about bailing out. On panic, the driver returns with the clock
still enabled and without restoring REG_CR. At a quick glance, this
looks like a bug. The comment should make it clear that the driver is
exiting on purpose without proper completion, and perhaps also explain
why this is safe during panic. Thanks.

> +		if (!nbcon_reacquire_nobuf(wctxt))
> +			return;
> +	}
>
>  	while ((pl011_read(uap, REG_FR) ^ uap->vendor->inv_fr) & uap->vendor->fr_busy)
>  		cpu_relax();
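
To illustrate the kind of comment meant above, here is a rough user-space
sketch of the bail-out loop. It is not the driver code. The two nbcon
functions are stand-in stubs, the bool return value of
nbcon_reacquire_nobuf() is taken from the test patch, and the struct
fields, write_thread_tail() and hw_restored are hypothetical names used
only for this example. The reason given in the comment for why the early
return is safe is an assumption and would need to be verified for the
real patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct nbcon_write_context. */
struct nbcon_write_context {
	bool panic_in_progress;	/* simulated: another CPU is panicking */
	int failed_enters;	/* simulated: times enter_unsafe fails */
};

/* Stub: fails while ownership is "lost", then succeeds. */
static bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt)
{
	if (wctxt->failed_enters > 0) {
		wctxt->failed_enters--;
		return false;
	}
	return true;
}

/* Stub: as in the test patch, false means "give up, panic in progress". */
static bool nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt)
{
	return !wctxt->panic_in_progress;
}

/* Records whether the (simulated) hardware state was restored. */
static bool hw_restored;

static void write_thread_tail(struct nbcon_write_context *wctxt)
{
	hw_restored = false;

	while (!nbcon_enter_unsafe(wctxt)) {
		/*
		 * Ownership cannot be reacquired because a panic is in
		 * progress. Return on purpose without restoring REG_CR
		 * and without disabling the clock.
		 *
		 * ASSUMPTION: this is safe because the panic context
		 * takes over the console hardware and this thread never
		 * touches it again.
		 */
		if (!nbcon_reacquire_nobuf(wctxt))
			return;
	}

	/* Normal path: the real driver restores REG_CR and the clock here. */
	hw_restored = true;
}
```

With panic_in_progress set and one failed enter, write_thread_tail()
returns early and hw_restored stays false. Without a panic, the loop
retries and the restore path runs.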
John Ogness