linux-kernel - Re: WARNING: CPU: 3 PID: 1 at block/blk-mq-cpumap.c:90 blk_mq_map_hw

Message-ID: <20250122125445.37db5c38@gandalf.local.home>
Date: Wed, 22 Jan 2025 12:54:45 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Daniel Wagner <dwagner@...e.de>
Cc: Christoph Hellwig <hch@....de>, LKML <linux-kernel@...r.kernel.org>,
 Linus Torvalds <torvalds@...ux-foundation.org>, Daniel Wagner
 <wagi@...nel.org>, Hannes Reinecke <hare@...e.de>, Ming Lei
 <ming.lei@...hat.com>, John Garry <john.g.garry@...cle.com>, Jens Axboe
 <axboe@...nel.dk>
Subject: Re: WARNING: CPU: 3 PID: 1 at block/blk-mq-cpumap.c:90
 blk_mq_map_hw_queues+0xf3/0x100

On Wed, 22 Jan 2025 18:08:35 +0100
Daniel Wagner <dwagner@...e.de> wrote:

> fallback:
> 	WARN_ON_ONCE(qmap->nr_queues > 1);
> 	blk_mq_clear_mq_map(...)
> }

I commented out the WARN_ON_ONCE() to see if I could finish my testing,
but then this triggered much later on in the tests:

[  813.092038] BUG: kernel NULL pointer dereference, address: 0000000000000090
[  813.094136] #PF: supervisor read access in kernel mode
[  813.095643] #PF: error_code(0x0000) - not-present page
[  813.095643] PGD 0 P4D 0
[  813.095643] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[  813.095643] CPU: 1 UID: 0 PID: 22 Comm: cpuhp/1 Not tainted 6.13.0-test-01253-g66611c047570-dirty #27
[  813.095643] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  813.095643] RIP: 0010:blk_mq_all_tag_iter+0x1a/0x270
[  813.095643] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 57 41 56 41 55 49 89 fd 41 54 55 53 48 83 ec 50 <48> 8b 87 90 00 00 00 65 4c 8b 04 25 28 00 00 00 4c 89 44 24 48 49
[  813.095643] RSP: 0018:ffffb668400d3da0 EFLAGS: 00010286
[  813.095643] RAX: 0000000000000000 RBX: ffffa1f0408e7200 RCX: 0000000000000000
[  813.095643] RDX: ffffb668400d3e28 RSI: ffffffffa1511d30 RDI: 0000000000000000
[  813.095643] RBP: ffffb668400d3e28 R08: ffffa1f0bbc9c528 R09: 0000000000000001
[  813.095643] R10: ffffa1f0408ec600 R11: 0000000000000001 R12: 0000000000000003
[  813.095643] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa1f0bbc9c528
[  813.095643] FS:  0000000000000000(0000) GS:ffffa1f0bbc80000(0000) knlGS:0000000000000000
[  813.095643] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  813.095643] CR2: 0000000000000090 CR3: 0000000104d42004 CR4: 0000000000170ef0
[  813.095643] Call Trace:
[  813.095643]  <TASK>
[  813.095643]  ? __die+0x56/0x97
[  813.095643]  ? page_fault_oops+0xbe/0x250
[  813.095643]  ? search_extable+0x26/0x30
[  813.095643]  ? blk_mq_all_tag_iter+0x1a/0x270
[  813.095643]  ? search_module_extables+0x19/0x60
[  813.095643]  ? exc_page_fault+0x227/0x6d0
[  813.095643]  ? affine_move_task+0x26f/0x510
[  813.095643]  ? asm_exc_page_fault+0x26/0x30
[  813.095643]  ? __pfx_blk_mq_has_request+0x10/0x10
[  813.095643]  ? blk_mq_all_tag_iter+0x1a/0x270
[  813.095643]  ? xas_load+0xd/0xd0
[  813.095643]  ? xa_load+0x7b/0xb0
[  813.095643]  blk_mq_hctx_notify_offline+0xf1/0x1a0
[  813.095643]  ? __pfx_blk_mq_hctx_notify_offline+0x10/0x10
[  813.095643]  cpuhp_invoke_callback+0x214/0x420
[  813.095643]  ? __pfx_smpboot_thread_fn+0x10/0x10
[  813.095643]  cpuhp_thread_fun+0x98/0x150
[  813.095643]  smpboot_thread_fn+0xdd/0x1d0
[  813.095643]  kthread+0xd2/0x100
[  813.095643]  ? __pfx_kthread+0x10/0x10
[  813.095643]  ret_from_fork+0x34/0x50
[  813.095643]  ? __pfx_kthread+0x10/0x10
[  813.095643]  ret_from_fork_asm+0x1a/0x30
[  813.095643]  </TASK>
[  813.095643] Modules linked in:
[  813.095643] CR2: 0000000000000090
[  813.095643] ---[ end trace 0000000000000000 ]---

It triggered after running mmiotrace, which shuts down and brings up CPUs.

Not sure it's related. I can check how reproducible this is, and if it is, I
can try to bisect it.

-- Steve