[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID:
<SN6PR02MB4157A4C5E8CB219A75263A17D46DA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Tue, 3 Jun 2025 03:18:23 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Toshiyuki Sato <fj6611ie@...jp.fujitsu.com>, Russell King
<linux@...linux.org.uk>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-serial@...r.kernel.org" <linux-serial@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "fj6611ie@...jp.fujitsu.com"
<fj6611ie@...jp.fujitsu.com>
Subject: Problem with nbcon console and amba-pl011 serial port
I'm seeing a problem in the panic path of an ARM64 VM on Hyper-V
in the Azure public cloud. Here's the output from the VM's serial port:
# taskset -c 4 /bin/echo c >/proc/sysrq-trigger
[
** replaying previous printk message **
[ 51.616656] sysrq: Trigger a crash
[ 51.616689] Kernel panic - not syncing: sysrq triggered crash
[ 51.624212] CPU: 4 UID: 0 PID: 2278 Comm: echo Tainted: G E 6.15.0-rc7-next-20250521 #1 VOLUNTARY
[ 51.630165] Tainted: [E]=UNSIGNED_MODULE
[ 51.632331] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/28/2024
[ 51.638771] Call trace:
[ 51.640179] show_stack+0x20/0x38 (C)
[ 51.642488] dump_stack_lvl+0xc8/0xf8
[ 51.644638] dump_stack+0x18/0x28
[ 51.646654] panic+0x384/0x478
[ 51.648371] sysrq_handle_crash+0x20/0x28
[ 51.650815] __handle_sysrq+0xdc/0x2b8
[ 51.653080] write_sysrq_trigger+0x124/0x240
[ 51.655508] proc_reg_write+0xa4/0x100
[ 51.657917] vfs_write+0xd8/0x3e0
[ 51.659836] ksys_write+0x74/0x110
[ 51.661735] __arm64_sys_write+0x24/0x38
[ 51.663967] invoke_syscall+0x6c/0xf8
[ 51.666025] el0_svc_common.constprop.0+0xc8/0xf0
[ 51.668771] do_el0_svc+0x24/0x38
[ 51.670713] el0_svc+0x40/0x198
[ 51.672509] el0t_64_sync_handler+0xc8/0xd0
[ 51.675170] el0t_64_sync+0x1b0/0x1b8
[ 51.677351] SMP: stopping secondary CPUs
[ 52.728175] SMP: failed to stop secondary CPUs 2
[ 52.731229] Kernel Offset: 0x5706ebce0000 from 0xffff800080000000
[ 52.734528] PHYS_OFFSET: 0x0
[ 52.736115] CPU features: 0x2000,400007c0,02110ca1,5401faab
[ 52.739275] Memory Limit: none
[ 52.803615] ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
The problem is the failure to stop secondary CPU 2. (The CPU # that fails
to stop varies from run-to-run.) It is mostly reproducible, but not always. I
bisected to commit 2eb2608618ce ("serial: amba-pl011: Implement nbcon
console") in the 6.15 kernel. Further custom logging shows the details of
why the problem can happen. Here are the steps:
1. The "echo" command ends up in __handle_sysrq(), which outputs the
"sysrq: Trigger a crash" message using pr_info(). I always ran the "echo"
command on CPU 4 just for consistency in my testing/debugging -- there's
nothing special about CPU 4.
2. The "pr/ttyAMA0" kernel thread handles the outputting of the message.
nbcon_kthread_func() calls nbcon_emit_one() with the "use_atomic" parameter
set to false. nbcon_emit_one() in turn calls nbcon_emit_next_record() with
the console spin lock held and interrupts disabled. nbcon_emit_next_record()
then calls pl011_console_write_thread(). The latter has a "for" loop to output
each character of the message, and my custom logging shows that it is indeed
outputting the string "[ 51.616656] sysrq: Trigger a crash".
3. While "pr/ttyAMA0" is doing its thing, __handle_sysrq() calls
sysrq_handle_crash(), which calls panic(). After some preliminaries, panic()
outputs the message "Kernel panic - not syncing: sysrq triggered crash"
using pr_emerg().
4. pr_emerg() has a high logging level, and it effectively steals the console
from the "pr/ttyAMA0" task, which I believe is intentional in the nbcon design.
Down in pl011_console_write_thread(), the "pr/ttyAMA0" task is doing
nbcon_enter_unsafe() and nbcon_exit_unsafe() around each character
that it outputs. When pr_emerg() steals the console, nbcon_exit_unsafe()
returns 0, so the "for" loop exits. pl011_console_write_thread() then
enters a busy "while" loop waiting to reclaim the console. It's doing this
busy "while" loop with interrupts disabled, and because of the panic,
it never succeeds. Whatever CPU is running "pr/ttyAMA0" is effectively
stuck at this point.
5. Meanwhile panic() continues, calling panic_other_cpus_shutdown(). On
ARM64, other CPUs are stopped by sending them an IPI. Each CPU receives
the IPI and calls the PSCI function to stop itself. But the CPU running
"pr/ttyAMA0" is looping forever with interrupts disabled, so it never
processes the IPI and it never stops. ARM64 doesn't have a true NMI that
can override the looping with interrupts disabled, so there's no way to
stop that CPU.
6. The failure to stop the "pr/ttyAMA0" CPU then causes downstream
problems, such as when loading and running a kdump kernel.
The problem is timing dependent. In some cases, the "pr/ttyAMA0"
thread is evidently able to get the message out before panic() calls
pr_emerg(). In my case running as a guest on Hyper-V, the pl011 device
in the guest VM is emulated by Hyper-V. Each pl011 register access
traps to the hypervisor, which slows things down and probably makes
the problem more likely. But from what I can see, the underlying
race condition exists regardless.
At this point, I just wanted to surface the problem. I don't have any
specific ideas on how to fix it, as my knowledge of nbcon is limited
to what I've learned in figuring out the failure path.
Toshiyuki -- what are your thoughts how to fix this?
Michael Kelley
Powered by blists - more mailing lists