Message-ID: <bf39250ebfb25160a5ce0abd9ce694f07cab8433.camel@gmx.de>
Date: Thu, 21 Aug 2025 12:06:15 +0200
From: Mike Galbraith <efault@....de>
To: Breno Leitao <leitao@...ian.org>
Cc: Pavel Begunkov <asml.silence@...il.com>, Jakub Kicinski
<kuba@...nel.org>, Johannes Berg <johannes@...solutions.net>,
paulmck@...nel.org, LKML <linux-kernel@...r.kernel.org>,
netdev@...r.kernel.org, boqun.feng@...il.com
Subject: Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning
On Thu, 2025-08-21 at 05:37 +0200, Mike Galbraith wrote:
> On Wed, 2025-08-20 at 10:36 -0700, Breno Leitao wrote:
> > On Wed, Aug 20, 2025 at 02:31:02PM +0200, Mike Galbraith wrote:
> > > On Tue, 2025-08-19 at 10:27 -0700, Breno Leitao wrote:
> > > >
> > > > I’ve continued investigating possible solutions, and it looks like
> > > > moving netconsole over to the non‑blocking console (nbcon) framework
> > > > might be the right approach. Unlike the classic console path, nbcon
> > > > doesn’t rely on the global console lock, which was one of the main
> > > > concerns regarding the possible deadlock.
> > >
> > > ATM at least, classic can remotely log a crash whereas nbcon can't per
> > > test drive, so it would be nice for classic to stick around until nbcon
> > > learns some atomic packet blasting.
> >
> > Oh, does it mean that during crash nbcon invokes `write_atomic` call
> > back, and because this patch doesn't implement it, it will not send
> > those pkts? Am I reading it correct?
>
> No, I'm just saying that the kernel's last gasp doesn't make it out of
> the box with CONFIG_NETCONSOLE_NBCON=y as your patch sits.
A quick test proved you correct as to the why.
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1952,12 +1952,12 @@ static void netcon_write_thread(struct c
 static void netconsole_device_lock(struct console *con, unsigned long *flags)
 {
 	/* protects all the targets at the same time */
-	spin_lock_irqsave(&target_list_lock, *flags);
+	spin_lock(&target_list_lock);
 }
 static void netconsole_device_unlock(struct console *con, unsigned long flags)
 {
-	spin_unlock_irqrestore(&target_list_lock, flags);
+	spin_unlock(&target_list_lock);
 }
 #endif
@@ -1966,6 +1966,7 @@ static struct console netconsole_ext = {
 #ifdef CONFIG_NETCONSOLE_NBCON
 	.flags = CON_ENABLED | CON_EXTENDED | CON_NBCON,
 	.write_thread = netcon_write_ext_thread,
+	.write_atomic = netcon_write_ext_thread,
 	.device_lock = netconsole_device_lock,
 	.device_unlock = netconsole_device_unlock,
 #else
@@ -1979,6 +1980,7 @@ static struct console netconsole = {
 #ifdef CONFIG_NETCONSOLE_NBCON
 	.flags = CON_ENABLED | CON_NBCON,
 	.write_thread = netcon_write_thread,
+	.write_atomic = netcon_write_thread,
 	.device_lock = netconsole_device_lock,
 	.device_unlock = netconsole_device_unlock,
 #else
...presto, wired desktop box now captures wireless lappy crash.
[ 48.378783] netconsole: network logging started
[ 77.329021] sysrq: Trigger a crash
[ 77.329392] Kernel panic - not syncing: sysrq triggered crash
[ 77.329556] ------------[ cut here ]------------
[ 77.329562] WARNING: CPU: 3 PID: 2452 at kernel/softirq.c:387 __local_bh_enable_ip+0x8f/0xe0
[ 77.329593] Modules linked in: netconsole ccm 8021q garp mrp af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs cmac algif_hash algif_skcipher af_alg iwlmvm snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic mac80211 binfmt_misc libarc4 intel_rapl_msr uvcvideo intel_rapl_common snd_hda_intel uvc x86_pkg_temp_thermal snd_intel_dspcfg videobuf2_vmalloc snd_hda_codec intel_powerclamp videobuf2_memops videobuf2_v4l2 iwlwifi coretemp btusb iTCO_wdt snd_hwdep kvm_intel btrtl snd_hda_core videobuf2_common intel_pmc_bxt nls_iso8859_1 btbcm iTCO_vendor_support nls_cp437 mei_hdcp mfd_core btintel videodev snd_pcm cfg80211 kvm mc bluetooth snd_timer irqbypass i2c_i801 snd pcspkr mei_me rfkill soundcore i2c_smbus mei thermal battery acpi_pad ac button joydev nfsd sch_fq_codel auth_rpcgss nfs_acl lockd grace sunrpc fuse dm_mod configfs dmi_sysfs hid_multitouch hid_generic usbhid i915 ghash_clmulni_intel i2c_algo_bit drm_buddy drm_client_lib video drm_display_helper xhci_pci xhci_hcd drm_kms_helper ahci ttm libahci
[ 77.329898] usbcore libata drm wmi usb_common serio_raw sd_mod scsi_dh_emc scsi_dh_rdac scsi_dh_alua sg scsi_mod scsi_common vfat fat virtio_blk virtio_mmio virtio virtio_ring ext4 crc16 mbcache jbd2 loop msr efivarfs autofs4 aesni_intel gf128mul
[ 77.329993] CPU: 3 UID: 0 PID: 2452 Comm: bash Kdump: loaded Tainted: G I 6.17.0.g068a56e5-master #231 PREEMPT(lazy)
[ 77.330011] Tainted: [I]=FIRMWARE_WORKAROUND
[ 77.330016] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
[ 77.330021] RIP: 0010:__local_bh_enable_ip+0x8f/0xe0
[ 77.330041] Code: 3e bf 01 00 00 00 e8 f0 68 03 00 e8 3b 75 14 00 fb 65 8b 05 ab af 9b 01 85 c0 74 41 5b 5d c3 65 8b 05 a1 e8 9b 01 85 c0 75 a4 <0f> 0b eb a0 e8 68 74 14 00 eb a1 48 89 ef e8 de c0 07 00 eb aa 48
[ 77.330050] RSP: 0018:ffff8881251bf898 EFLAGS: 00010046
[ 77.330061] RAX: 0000000000000000 RBX: 0000000000000201 RCX: ffff8881251bf854
[ 77.330069] RDX: 0000000000000001 RSI: 0000000000000201 RDI: ffffffffa136b870
[ 77.330075] RBP: ffffffffa136b870 R08: 0000000000000002 R09: ffffffff832b6820
[ 77.330081] R10: 0000000000000001 R11: 0000000000000000 R12: ffff888126da2168
[ 77.330088] R13: ffff8881221e8f00 R14: ffff888126da2000 R15: ffff8881221e8f20
[ 77.330095] FS: 00007f22ad9c3740(0000) GS:ffff88826130c000(0000) knlGS:0000000000000000
[ 77.330104] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 77.330111] CR2: 0000563c1913f2e0 CR3: 0000000111548006 CR4: 00000000003726f0
[ 77.330118] Call Trace:
[ 77.330124] <TASK>
[ 77.330139] ieee80211_queue_skb+0x140/0x350 [mac80211]
[ 77.330428] __ieee80211_xmit_fast+0x217/0x3a0 [mac80211]
[ 77.330698] ? __skb_get_hash_net+0x47/0x1c0
[ 77.330718] ? __skb_get_hash_net+0x47/0x1c0
[ 77.330768] ieee80211_xmit_fast+0xee/0x1e0 [mac80211]
[ 77.331012] __ieee80211_subif_start_xmit+0x141/0x390 [mac80211]
[ 77.331218] ? __lock_acquire+0x550/0xbc0
[ 77.331268] ieee80211_subif_start_xmit+0x39/0x200 [mac80211]
[ 77.331478] ? lock_acquire.part.0+0xa4/0x1e0
[ 77.331512] ? netif_skb_features+0xb6/0x2b0
[ 77.331535] netpoll_start_xmit+0x125/0x1a0
[ 77.331569] __netpoll_send_skb+0x309/0x310
[ 77.331594] ? netpoll_send_skb+0x24/0x80
[ 77.331618] netpoll_send_skb+0x42/0x80
[ 77.331644] netcon_write_thread+0xb3/0xe0 [netconsole]
[ 77.331684] nbcon_emit_next_record+0x25f/0x290
[ 77.331739] __nbcon_atomic_flush_pending_con+0x9a/0xf0
[ 77.331786] __nbcon_atomic_flush_pending+0xbc/0x130
[ 77.331822] vprintk_emit+0x258/0x540
[ 77.331866] _printk+0x4c/0x50
[ 77.331908] vpanic+0xb1/0x290
[ 77.331934] panic+0x4c/0x4c
[ 77.331956] ? rcu_read_unlock+0x17/0x60
[ 77.331993] sysrq_handle_crash+0x1a/0x20
[ 77.332011] __handle_sysrq.cold+0x8f/0xd4
[ 77.332037] write_sysrq_trigger+0x66/0x80
[ 77.332059] proc_reg_write+0x53/0x90
[ 77.332074] ? rcu_read_lock_any_held+0x6b/0xa0
[ 77.332090] vfs_write+0xcc/0x550
[ 77.332115] ? exc_page_fault+0x75/0x1e0
[ 77.332130] ? __lock_release.isra.0+0x54/0x140
[ 77.332150] ? exc_page_fault+0x75/0x1e0
[ 77.332167] ? exc_page_fault+0x75/0x1e0
[ 77.332199] ksys_write+0x5c/0xd0
[ 77.332228] do_syscall_64+0x76/0x3d0
[ 77.332260] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 77.332271] RIP: 0033:0x7f22ad721000
[ 77.332285] Code: 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 80 3d 09 ca 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[ 77.332294] RSP: 002b:00007ffd75be2678 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 77.332306] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f22ad721000
[ 77.332313] RDX: 0000000000000002 RSI: 0000563c1913f2e0 RDI: 0000000000000001
[ 77.332319] RBP: 0000563c1913f2e0 R08: 0000000000000000 R09: 0000000000000000
[ 77.332326] R10: 00007f22ad610ea0 R11: 0000000000000202 R12: 0000000000000002
[ 77.332331] R13: 00007f22ad8005c0 R14: 00007f22ad7fdf60 R15: 0000563c19336af0
[ 77.332405] </TASK>
[ 77.332410] irq event stamp: 44121
[ 77.332415] hardirqs last enabled at (44119): [<ffffffff81351872>] __up_console_sem+0x52/0x60
[ 77.332429] hardirqs last disabled at (44120): [<ffffffff8120632a>] vpanic+0x3a/0x290
[ 77.332442] softirqs last enabled at (43324): [<ffffffff812cf84e>] handle_softirqs+0x31e/0x3f0
[ 77.332459] softirqs last disabled at (44121): [<ffffffff81aca754>] netpoll_send_skb+0x24/0x80
[ 77.332475] ---[ end trace 0000000000000000 ]---
[ 77.336439] CPU: 3 UID: 0 PID: 2452 Comm: bash Kdump: loaded Tainted: G W I 6.17.0.g068a56e5-master #231 PREEMPT(lazy)
[ 77.336507] Tainted: [W]=WARN, [I]=FIRMWARE_WORKAROUND
[ 77.336552] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
[ 77.336597] Call Trace:
[ 77.336646] <TASK>
[ 77.336705] dump_stack_lvl+0x5b/0x80
[ 77.336848] vpanic+0xca/0x290
[ 77.336968] panic+0x4c/0x4c
[ 77.337045] ? rcu_read_unlock+0x17/0x60
[ 77.337127] sysrq_handle_crash+0x1a/0x20
[ 77.337186] __handle_sysrq.cold+0x8f/0xd4
[ 77.337253] write_sysrq_trigger+0x66/0x80
[ 77.337315] proc_reg_write+0x53/0x90
[ 77.337373] ? rcu_read_lock_any_held+0x6b/0xa0
[ 77.337430] vfs_write+0xcc/0x550
[ 77.337494] ? exc_page_fault+0x75/0x1e0
[ 77.337550] ? __lock_release.isra.0+0x54/0x140
[ 77.337612] ? exc_page_fault+0x75/0x1e0
[ 77.338937] ? exc_page_fault+0x75/0x1e0
[ 77.339125] ksys_write+0x5c/0xd0
[ 77.339227] do_syscall_64+0x76/0x3d0
[ 77.339302] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 77.339355] RIP: 0033:0x7f22ad721000
[ 77.339410] Code: 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 80 3d 09 ca 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[ 77.339460] RSP: 002b:00007ffd75be2678 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 77.339517] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f22ad721000
[ 77.339565] RDX: 0000000000000002 RSI: 0000563c1913f2e0 RDI: 0000000000000001
[ 77.339611] RBP: 0000563c1913f2e0 R08: 0000000000000000 R09: 0000000000000000
[ 77.339655] R10: 00007f22ad610ea0 R11: 0000000000000202 R12: 0000000000000002
[ 77.339700] R13: 00007f22ad8005c0 R14: 00007f22ad7fdf60 R15: 0000563c19336af0
[ 77.339807] </TASK>
The wireless stack now hates vpanic() for disabling IRQs, but that's
way better than death rattle not being transmitted.
-Mike
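
The exchange above turns on which callback an nbcon console offers when the
kernel is panicking: a console that registers only write_thread has nothing
to call on the atomic flush path, while one that also sets write_atomic can
still emit. A rough userspace sketch of that selection follows. It is not
kernel code: struct toy_console, toy_emit() and toy_panic_flush() are
hypothetical names invented for illustration, and the real nbcon logic also
handles ownership, locking and context checks that are omitted here.

```c
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for the kernel's struct console; NOT the real definition. */
struct toy_console {
	const char *name;
	void (*write_thread)(struct toy_console *con, const char *msg);
	void (*write_atomic)(struct toy_console *con, const char *msg);
	int sent;	/* number of records emitted through this console */
};

/* Hypothetical emit routine shared by both callbacks, as in the test patch. */
static void toy_emit(struct toy_console *con, const char *msg)
{
	printf("[%s] %s\n", con->name, msg);
	con->sent++;
}

/*
 * Simplified model of a panic-time flush: the printer thread cannot be
 * relied upon, so only consoles that provide write_atomic get the record.
 * Returns the number of consoles that emitted it.
 */
static int toy_panic_flush(struct toy_console **cons, size_t n, const char *msg)
{
	int emitted = 0;

	for (size_t i = 0; i < n; i++) {
		if (!cons[i]->write_atomic)
			continue;	/* thread-only console: record is not sent */
		cons[i]->write_atomic(cons[i], msg);
		emitted++;
	}
	return emitted;
}
```

In this model, a console set up like the unpatched netconsole (write_thread
only) is skipped by toy_panic_flush(), while one that points write_atomic at
the same routine, as the test patch does, emits the record.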