Message-ID: <bf39250ebfb25160a5ce0abd9ce694f07cab8433.camel@gmx.de>
Date: Thu, 21 Aug 2025 12:06:15 +0200
From: Mike Galbraith <efault@....de>
To: Breno Leitao <leitao@...ian.org>
Cc: Pavel Begunkov <asml.silence@...il.com>, Jakub Kicinski
 <kuba@...nel.org>,  Johannes Berg <johannes@...solutions.net>,
 paulmck@...nel.org, LKML <linux-kernel@...r.kernel.org>, 
 netdev@...r.kernel.org, boqun.feng@...il.com
Subject: Re: netconsole: HARDIRQ-safe -> HARDIRQ-unsafe lock order warning

On Thu, 2025-08-21 at 05:37 +0200, Mike Galbraith wrote:
> On Wed, 2025-08-20 at 10:36 -0700, Breno Leitao wrote:
> > On Wed, Aug 20, 2025 at 02:31:02PM +0200, Mike Galbraith wrote:
> > > On Tue, 2025-08-19 at 10:27 -0700, Breno Leitao wrote:
> > > > 
> > > > I’ve continued investigating possible solutions, and it looks like
> > > > moving netconsole over to the non‑blocking console (nbcon) framework
> > > > might be the right approach. Unlike the classic console path, nbcon
> > > > doesn’t rely on the global console lock, which was one of the main
> > > > concerns regarding the possible deadlock.
> > > 
> > > ATM at least, classic can remotely log a crash whereas nbcon can't per
> > > test drive, so it would be nice for classic to stick around until nbcon
> > > learns some atomic packet blasting.
> > 
> > Oh, does it mean that during a crash nbcon invokes the `write_atomic`
> > callback, and because this patch doesn't implement it, it will not send
> > those pkts? Am I reading it correctly?
> 
> No, I'm just saying that the kernel's last gasp doesn't make it out of
> the box with CONFIG_NETCONSOLE_NBCON=y as your patch sits.

A quick test proved you correct as to the why.

--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -1952,12 +1952,12 @@ static void netcon_write_thread(struct c
 static void netconsole_device_lock(struct console *con, unsigned long *flags)
 {
 	/* protects all the targets at the same time */
-	spin_lock_irqsave(&target_list_lock, *flags);
+	spin_lock(&target_list_lock);
 }
 
 static void netconsole_device_unlock(struct console *con, unsigned long flags)
 {
-	spin_unlock_irqrestore(&target_list_lock, flags);
+	spin_unlock(&target_list_lock);
 }
 #endif
 
@@ -1966,6 +1966,7 @@ static struct console netconsole_ext = {
 #ifdef CONFIG_NETCONSOLE_NBCON
 	.flags	= CON_ENABLED | CON_EXTENDED | CON_NBCON,
 	.write_thread = netcon_write_ext_thread,
+	.write_atomic = netcon_write_ext_thread,
 	.device_lock = netconsole_device_lock,
 	.device_unlock = netconsole_device_unlock,
 #else
@@ -1979,6 +1980,7 @@ static struct console netconsole = {
 #ifdef CONFIG_NETCONSOLE_NBCON
 	.flags	= CON_ENABLED | CON_NBCON,
 	.write_thread = netcon_write_thread,
+	.write_atomic = netcon_write_thread,
 	.device_lock = netconsole_device_lock,
 	.device_unlock = netconsole_device_unlock,
 #else

...presto, wired desktop box now captures wireless lappy crash.

[   48.378783] netconsole: network logging started
[   77.329021] sysrq: Trigger a crash
[   77.329392] Kernel panic - not syncing: sysrq triggered crash
[   77.329556] ------------[ cut here ]------------
[   77.329562] WARNING: CPU: 3 PID: 2452 at kernel/softirq.c:387 __local_bh_enable_ip+0x8f/0xe0
[   77.329593] Modules linked in: netconsole ccm 8021q garp mrp af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs cmac algif_hash algif_skcipher af_alg iwlmvm snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic mac80211 binfmt_misc libarc4 intel_rapl_msr uvcvideo intel_rapl_common snd_hda_intel uvc x86_pkg_temp_thermal snd_intel_dspcfg videobuf2_vmalloc snd_hda_codec intel_powerclamp videobuf2_memops videobuf2_v4l2 iwlwifi coretemp btusb iTCO_wdt snd_hwdep kvm_intel btrtl snd_hda_core videobuf2_common intel_pmc_bxt nls_iso8859_1 btbcm iTCO_vendor_support nls_cp437 mei_hdcp mfd_core btintel videodev snd_pcm cfg80211 kvm mc bluetooth snd_timer irqbypass i2c_i801 snd pcspkr mei_me rfkill soundcore i2c_smbus mei thermal battery acpi_pad ac button joydev nfsd sch_fq_codel auth_rpcgss nfs_acl lockd grace sunrpc fuse dm_mod configfs dmi_sysfs hid_multitouch hid_generic usbhid i915 ghash_clmulni_intel i2c_algo_bit drm_buddy drm_client_lib video drm_display_helper xhci_pci xhci_hcd drm_kms_helper ahci ttm libahci
[   77.329898]  usbcore libata drm wmi usb_common serio_raw sd_mod scsi_dh_emc scsi_dh_rdac scsi_dh_alua sg scsi_mod scsi_common vfat fat virtio_blk virtio_mmio virtio virtio_ring ext4 crc16 mbcache jbd2 loop msr efivarfs autofs4 aesni_intel gf128mul
[   77.329993] CPU: 3 UID: 0 PID: 2452 Comm: bash Kdump: loaded Tainted: G          I         6.17.0.g068a56e5-master #231 PREEMPT(lazy) 
[   77.330011] Tainted: [I]=FIRMWARE_WORKAROUND
[   77.330016] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
[   77.330021] RIP: 0010:__local_bh_enable_ip+0x8f/0xe0
[   77.330041] Code: 3e bf 01 00 00 00 e8 f0 68 03 00 e8 3b 75 14 00 fb 65 8b 05 ab af 9b 01 85 c0 74 41 5b 5d c3 65 8b 05 a1 e8 9b 01 85 c0 75 a4 <0f> 0b eb a0 e8 68 74 14 00 eb a1 48 89 ef e8 de c0 07 00 eb aa 48
[   77.330050] RSP: 0018:ffff8881251bf898 EFLAGS: 00010046
[   77.330061] RAX: 0000000000000000 RBX: 0000000000000201 RCX: ffff8881251bf854
[   77.330069] RDX: 0000000000000001 RSI: 0000000000000201 RDI: ffffffffa136b870
[   77.330075] RBP: ffffffffa136b870 R08: 0000000000000002 R09: ffffffff832b6820
[   77.330081] R10: 0000000000000001 R11: 0000000000000000 R12: ffff888126da2168
[   77.330088] R13: ffff8881221e8f00 R14: ffff888126da2000 R15: ffff8881221e8f20
[   77.330095] FS:  00007f22ad9c3740(0000) GS:ffff88826130c000(0000) knlGS:0000000000000000
[   77.330104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   77.330111] CR2: 0000563c1913f2e0 CR3: 0000000111548006 CR4: 00000000003726f0
[   77.330118] Call Trace:
[   77.330124]  <TASK>
[   77.330139]  ieee80211_queue_skb+0x140/0x350 [mac80211]
[   77.330428]  __ieee80211_xmit_fast+0x217/0x3a0 [mac80211]
[   77.330698]  ? __skb_get_hash_net+0x47/0x1c0
[   77.330718]  ? __skb_get_hash_net+0x47/0x1c0
[   77.330768]  ieee80211_xmit_fast+0xee/0x1e0 [mac80211]
[   77.331012]  __ieee80211_subif_start_xmit+0x141/0x390 [mac80211]
[   77.331218]  ? __lock_acquire+0x550/0xbc0
[   77.331268]  ieee80211_subif_start_xmit+0x39/0x200 [mac80211]
[   77.331478]  ? lock_acquire.part.0+0xa4/0x1e0
[   77.331512]  ? netif_skb_features+0xb6/0x2b0
[   77.331535]  netpoll_start_xmit+0x125/0x1a0
[   77.331569]  __netpoll_send_skb+0x309/0x310
[   77.331594]  ? netpoll_send_skb+0x24/0x80
[   77.331618]  netpoll_send_skb+0x42/0x80
[   77.331644]  netcon_write_thread+0xb3/0xe0 [netconsole]
[   77.331684]  nbcon_emit_next_record+0x25f/0x290
[   77.331739]  __nbcon_atomic_flush_pending_con+0x9a/0xf0
[   77.331786]  __nbcon_atomic_flush_pending+0xbc/0x130
[   77.331822]  vprintk_emit+0x258/0x540
[   77.331866]  _printk+0x4c/0x50
[   77.331908]  vpanic+0xb1/0x290
[   77.331934]  panic+0x4c/0x4c
[   77.331956]  ? rcu_read_unlock+0x17/0x60
[   77.331993]  sysrq_handle_crash+0x1a/0x20
[   77.332011]  __handle_sysrq.cold+0x8f/0xd4
[   77.332037]  write_sysrq_trigger+0x66/0x80
[   77.332059]  proc_reg_write+0x53/0x90
[   77.332074]  ? rcu_read_lock_any_held+0x6b/0xa0
[   77.332090]  vfs_write+0xcc/0x550
[   77.332115]  ? exc_page_fault+0x75/0x1e0
[   77.332130]  ? __lock_release.isra.0+0x54/0x140
[   77.332150]  ? exc_page_fault+0x75/0x1e0
[   77.332167]  ? exc_page_fault+0x75/0x1e0
[   77.332199]  ksys_write+0x5c/0xd0
[   77.332228]  do_syscall_64+0x76/0x3d0
[   77.332260]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[   77.332271] RIP: 0033:0x7f22ad721000
[   77.332285] Code: 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 80 3d 09 ca 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[   77.332294] RSP: 002b:00007ffd75be2678 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[   77.332306] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f22ad721000
[   77.332313] RDX: 0000000000000002 RSI: 0000563c1913f2e0 RDI: 0000000000000001
[   77.332319] RBP: 0000563c1913f2e0 R08: 0000000000000000 R09: 0000000000000000
[   77.332326] R10: 00007f22ad610ea0 R11: 0000000000000202 R12: 0000000000000002
[   77.332331] R13: 00007f22ad8005c0 R14: 00007f22ad7fdf60 R15: 0000563c19336af0
[   77.332405]  </TASK>
[   77.332410] irq event stamp: 44121
[   77.332415] hardirqs last  enabled at (44119): [<ffffffff81351872>] __up_console_sem+0x52/0x60
[   77.332429] hardirqs last disabled at (44120): [<ffffffff8120632a>] vpanic+0x3a/0x290
[   77.332442] softirqs last  enabled at (43324): [<ffffffff812cf84e>] handle_softirqs+0x31e/0x3f0
[   77.332459] softirqs last disabled at (44121): [<ffffffff81aca754>] netpoll_send_skb+0x24/0x80
[   77.332475] ---[ end trace 0000000000000000 ]---
[   77.336439] CPU: 3 UID: 0 PID: 2452 Comm: bash Kdump: loaded Tainted: G        W I         6.17.0.g068a56e5-master #231 PREEMPT(lazy) 
[   77.336507] Tainted: [W]=WARN, [I]=FIRMWARE_WORKAROUND
[   77.336552] Hardware name: HP HP Spectre x360 Convertible/804F, BIOS F.47 11/22/2017
[   77.336597] Call Trace:
[   77.336646]  <TASK>
[   77.336705]  dump_stack_lvl+0x5b/0x80
[   77.336848]  vpanic+0xca/0x290
[   77.336968]  panic+0x4c/0x4c
[   77.337045]  ? rcu_read_unlock+0x17/0x60
[   77.337127]  sysrq_handle_crash+0x1a/0x20
[   77.337186]  __handle_sysrq.cold+0x8f/0xd4
[   77.337253]  write_sysrq_trigger+0x66/0x80
[   77.337315]  proc_reg_write+0x53/0x90
[   77.337373]  ? rcu_read_lock_any_held+0x6b/0xa0
[   77.337430]  vfs_write+0xcc/0x550
[   77.337494]  ? exc_page_fault+0x75/0x1e0
[   77.337550]  ? __lock_release.isra.0+0x54/0x140
[   77.337612]  ? exc_page_fault+0x75/0x1e0
[   77.338937]  ? exc_page_fault+0x75/0x1e0
[   77.339125]  ksys_write+0x5c/0xd0
[   77.339227]  do_syscall_64+0x76/0x3d0
[   77.339302]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
[   77.339355] RIP: 0033:0x7f22ad721000
[   77.339410] Code: 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 80 3d 09 ca 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
[   77.339460] RSP: 002b:00007ffd75be2678 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[   77.339517] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f22ad721000
[   77.339565] RDX: 0000000000000002 RSI: 0000563c1913f2e0 RDI: 0000000000000001
[   77.339611] RBP: 0000563c1913f2e0 R08: 0000000000000000 R09: 0000000000000000
[   77.339655] R10: 00007f22ad610ea0 R11: 0000000000000202 R12: 0000000000000002
[   77.339700] R13: 00007f22ad8005c0 R14: 00007f22ad7fdf60 R15: 0000563c19336af0
[   77.339807]  </TASK>

The wireless stack now hates vpanic() for disabling IRQs (hence the
__local_bh_enable_ip() WARNING above), but that's way better than the
death rattle not being transmitted.
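
For anyone following along, here is a userspace toy of why wiring up
.write_atomic matters. It assumes the panic-time flush only ever calls the
atomic callback and skips a console that has none. struct mock_console,
mock_write() and mock_flush() are made up for illustration; none of this is
the kernel's nbcon API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace mock only: NOT the kernel's struct console. */
struct mock_console {
	const char *name;
	void (*write_thread)(struct mock_console *con, const char *msg);
	void (*write_atomic)(struct mock_console *con, const char *msg);
	int sent;	/* number of records this console emitted */
};

/* Hypothetical writer, reused for both callbacks like in the diff above. */
static void mock_write(struct mock_console *con, const char *msg)
{
	printf("%s: %s\n", con->name, msg);
	con->sent++;
}

/*
 * Assumed flush policy: in panic context only the atomic callback may
 * run, so a console without one silently drops the record.
 * Returns true if the record was emitted.
 */
static bool mock_flush(struct mock_console *con, const char *msg, bool in_panic)
{
	void (*emit)(struct mock_console *, const char *);

	assert(con != NULL);
	emit = in_panic ? con->write_atomic : con->write_thread;
	if (!emit)
		return false;
	emit(con, msg);
	return true;
}
```

With only .write_thread set, mock_flush(con, msg, true) returns false and
nothing goes out. Setting .write_atomic as well lets the panic-time record
through, which is all the two added lines in the diff are after.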

	-Mike

