linux-kernel - 6.12 WARNING in netfs_consume_read

Message-ID: <CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com>
Date: Tue, 19 Nov 2024 00:04:39 +0100
From: Max Kellermann <max.kellermann@...os.com>
To: David Howells <dhowells@...hat.com>, Jeff Layton <jlayton@...nel.org>, netfs@...ts.linux.dev, 
	linux-fsdevel <linux-fsdevel@...r.kernel.org>, linux-kernel@...r.kernel.org
Subject: 6.12 WARNING in netfs_consume_read_data()

Hi David & netfs developers,

I tried the new Linux kernel 6.12 today and it quickly reported a bug:

 ------------[ cut here ]------------
 WARNING: CPU: 13 PID: 0 at kernel/softirq.c:361 __local_bh_enable_ip+0x37/0x50
 Modules linked in:
 CPU: 13 UID: 0 PID: 0 Comm: swapper/13 Not tainted 6.12.0-cm4all1-hp+ #236
 Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 09/05/2019
 RIP: 0010:__local_bh_enable_ip+0x37/0x50
 Code: 00 0f 00 75 25 83 ee 01 f7 de 65 01 35 2a 93 d8 70 65 f7 05 1f 93 d8 70 00 ff ff 00 74 10 65 ff 0d 16 93 d8 70 c3 cc cc cc cc <0f> 0b eb d7 65 66 83 3d 24 93 d8 70 00 74 e5 e8 45 ff ff ff eb de
 RSP: 0018:ffff979300464d30 EFLAGS: 00010006
 RAX: dead000000000122 RBX: ffff8c5cd1045b00 RCX: ffff8c5cdeff7800
 RDX: ffff8c5d022f91f0 RSI: 0000000000000200 RDI: ffffffff8f5ee1cc
 RBP: 0000000000000000 R08: 0000000000000002 R09: ffff8cba7cf6eb68
 R10: ffffffff910621e0 R11: 0000000000000001 R12: ffff8c5d022f9368
 R13: 0000000000001000 R14: ffff8c5d022f9368 R15: 0000000000001000
 FS:  0000000000000000(0000) GS:ffff8cba7cf40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055ea9986d368 CR3: 000000603ee2e005 CR4: 00000000007706f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <IRQ>
  ? __warn+0x81/0x110
  ? __local_bh_enable_ip+0x37/0x50
  ? report_bug+0x14c/0x170
  ? handle_bug+0x53/0x90
  ? exc_invalid_op+0x13/0x60
  ? asm_exc_invalid_op+0x16/0x20
  ? netfs_consume_read_data.isra.0+0x3fc/0xa70
  ? __local_bh_enable_ip+0x37/0x50
  netfs_consume_read_data.isra.0+0x3fc/0xa70
  ? __pfx_cachefiles_read_complete+0x10/0x10
  netfs_read_subreq_terminated+0x265/0x360
  cachefiles_read_complete+0x45/0xf0
  iomap_dio_bio_end_io+0x122/0x160
  blk_update_request+0xf1/0x3e0
  scsi_end_request+0x27/0x190
  scsi_io_completion+0x43/0x6c0
  pqi_irq_handler+0x108/0xcd0
  __handle_irq_event_percpu+0x43/0x160
  handle_irq_event+0x27/0x70
  handle_edge_irq+0x82/0x220
  __common_interrupt+0x37/0xb0
  common_interrupt+0x74/0xa0
  </IRQ>
  <TASK>
  asm_common_interrupt+0x22/0x40
 RIP: 0010:cpuidle_enter_state+0xba/0x3c0
 Code: 00 e8 7a 06 1d ff e8 45 f7 ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 83 38 1c ff 45 84 ff 0f 85 16 01 00 00 fb 45 85 f6 <0f> 88 76 01 00 00 48 8b 04 24 49 63 ce 48 6b d1 68 49 29 c5 48 89
 RSP: 0018:ffff97930015fe98 EFLAGS: 00000206
 RAX: ffff8cba7cf40000 RBX: ffffb792fe94f250 RCX: 000000000000001f
 RDX: 000000000000000d RSI: 0000000037c86db9 RDI: 0000000000000000
 RBP: 0000000000000003 R08: 0000000000000002 R09: 0000000000000020
 R10: 0000000000000003 R11: 0000000000000015 R12: ffffffff9125b220
 R13: 000006c7b80e3e03 R14: 0000000000000003 R15: 0000000000000000
  ? cpuidle_enter_state+0xad/0x3c0
  cpuidle_enter+0x29/0x40
  do_idle+0x19f/0x200
  cpu_startup_entry+0x25/0x30
  start_secondary+0xf3/0x100
  common_startup_64+0x13e/0x148
  </TASK>
 ---[ end trace 0000000000000000 ]---

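The warning fires in __local_bh_enable_ip(), called from
netfs_consume_read_data(). Apparently, the netfs code doesn't expect to
be called from hardirq context, but the cachefiles read completion
callback may run in a hardirq handler, as it does here via
iomap_dio_bio_end_io().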
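
To illustrate why this combination trips the warning, here is a small
userspace model in Python. It is only a sketch, not kernel code: the bit
layout is what I recall from include/linux/preempt.h (please check your
tree), and ModelCpu and completion_callback() are hypothetical names made
up for this example. The model assumes the check behind the WARNING is
the "am I in hardirq context?" test in __local_bh_enable_ip().

```python
# Userspace model of the preempt_count bookkeeping (assumed layout:
# 8 preempt bits, 8 softirq bits, 4 hardirq bits; verify against
# include/linux/preempt.h in your tree).
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 4

SOFTIRQ_SHIFT = PREEMPT_BITS
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS

SOFTIRQ_OFFSET = 1 << SOFTIRQ_SHIFT
SOFTIRQ_DISABLE_OFFSET = 2 * SOFTIRQ_OFFSET
HARDIRQ_OFFSET = 1 << HARDIRQ_SHIFT
HARDIRQ_MASK = ((1 << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT


class ModelCpu:
    """Hypothetical stand-in for one CPU's preempt_count state."""

    def __init__(self):
        self.preempt_count = 0
        self.warnings = []

    def in_hardirq(self):
        # Non-zero hardirq bits mean we are inside a hardirq handler.
        return bool(self.preempt_count & HARDIRQ_MASK)

    def irq_enter(self):
        self.preempt_count += HARDIRQ_OFFSET

    def irq_exit(self):
        self.preempt_count -= HARDIRQ_OFFSET

    def local_bh_disable(self):
        self.preempt_count += SOFTIRQ_DISABLE_OFFSET

    def local_bh_enable(self):
        # Models the check assumed to be behind the WARNING above:
        # re-enabling bottom halves is not expected in hardirq context.
        if self.in_hardirq():
            self.warnings.append("local_bh_enable() in hardirq context")
        self.preempt_count -= SOFTIRQ_DISABLE_OFFSET


def completion_callback(cpu):
    """Hypothetical completion handler that takes and drops a _bh lock."""
    cpu.local_bh_disable()
    cpu.local_bh_enable()


if __name__ == "__main__":
    cpu = ModelCpu()
    completion_callback(cpu)   # process context: no warning recorded
    cpu.irq_enter()
    completion_callback(cpu)   # hardirq context: warning recorded
    cpu.irq_exit()
    print(cpu.warnings)
```

In this model SOFTIRQ_DISABLE_OFFSET comes out as 0x200, which would be
consistent with RSI: 0000000000000200 in the register dump, assuming RSI
holds the cnt argument of __local_bh_enable_ip().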

The source file containing netfs_consume_read_data() is new in 6.12, so
this regression may have been caused by David Howells's commit
ee4cdf7ba857 ("netfs: Speed up buffered reading").

Max