Message-ID: <20250122125445.37db5c38@gandalf.local.home>
Date: Wed, 22 Jan 2025 12:54:45 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Daniel Wagner <dwagner@...e.de>
Cc: Christoph Hellwig <hch@....de>, LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, Daniel Wagner
<wagi@...nel.org>, Hannes Reinecke <hare@...e.de>, Ming Lei
<ming.lei@...hat.com>, John Garry <john.g.garry@...cle.com>, Jens Axboe
<axboe@...nel.dk>
Subject: Re: WARNING: CPU: 3 PID: 1 at block/blk-mq-cpumap.c:90 blk_mq_map_hw_queues+0xf3/0x100
On Wed, 22 Jan 2025 18:08:35 +0100
Daniel Wagner <dwagner@...e.de> wrote:
> fallback:
> WARN_ON_ONCE(qmap->nr_queues > 1);
> blk_mq_clear_mq_map(...)
> }
I commented out the WARN_ON_ONCE() to see if I could finish my testing,
but much later on in the tests it triggered this instead:
[ 813.092038] BUG: kernel NULL pointer dereference, address: 0000000000000090
[ 813.094136] #PF: supervisor read access in kernel mode
[ 813.095643] #PF: error_code(0x0000) - not-present page
[ 813.095643] PGD 0 P4D 0
[ 813.095643] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[ 813.095643] CPU: 1 UID: 0 PID: 22 Comm: cpuhp/1 Not tainted 6.13.0-test-01253-g66611c047570-dirty #27
[ 813.095643] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 813.095643] RIP: 0010:blk_mq_all_tag_iter+0x1a/0x270
[ 813.095643] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 41 57 41 56 41 55 49 89 fd 41 54 55 53 48 83 ec 50 <48> 8b 87 90 00 00 00 65 4c 8b 04 25 28 00 00 00 4c 89 44 24 48 49
[ 813.095643] RSP: 0018:ffffb668400d3da0 EFLAGS: 00010286
[ 813.095643] RAX: 0000000000000000 RBX: ffffa1f0408e7200 RCX: 0000000000000000
[ 813.095643] RDX: ffffb668400d3e28 RSI: ffffffffa1511d30 RDI: 0000000000000000
[ 813.095643] RBP: ffffb668400d3e28 R08: ffffa1f0bbc9c528 R09: 0000000000000001
[ 813.095643] R10: ffffa1f0408ec600 R11: 0000000000000001 R12: 0000000000000003
[ 813.095643] R13: 0000000000000000 R14: 0000000000000000 R15: ffffa1f0bbc9c528
[ 813.095643] FS: 0000000000000000(0000) GS:ffffa1f0bbc80000(0000) knlGS:0000000000000000
[ 813.095643] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 813.095643] CR2: 0000000000000090 CR3: 0000000104d42004 CR4: 0000000000170ef0
[ 813.095643] Call Trace:
[ 813.095643] <TASK>
[ 813.095643] ? __die+0x56/0x97
[ 813.095643] ? page_fault_oops+0xbe/0x250
[ 813.095643] ? search_extable+0x26/0x30
[ 813.095643] ? blk_mq_all_tag_iter+0x1a/0x270
[ 813.095643] ? search_module_extables+0x19/0x60
[ 813.095643] ? exc_page_fault+0x227/0x6d0
[ 813.095643] ? affine_move_task+0x26f/0x510
[ 813.095643] ? asm_exc_page_fault+0x26/0x30
[ 813.095643] ? __pfx_blk_mq_has_request+0x10/0x10
[ 813.095643] ? blk_mq_all_tag_iter+0x1a/0x270
[ 813.095643] ? xas_load+0xd/0xd0
[ 813.095643] ? xa_load+0x7b/0xb0
[ 813.095643] blk_mq_hctx_notify_offline+0xf1/0x1a0
[ 813.095643] ? __pfx_blk_mq_hctx_notify_offline+0x10/0x10
[ 813.095643] cpuhp_invoke_callback+0x214/0x420
[ 813.095643] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 813.095643] cpuhp_thread_fun+0x98/0x150
[ 813.095643] smpboot_thread_fn+0xdd/0x1d0
[ 813.095643] kthread+0xd2/0x100
[ 813.095643] ? __pfx_kthread+0x10/0x10
[ 813.095643] ret_from_fork+0x34/0x50
[ 813.095643] ? __pfx_kthread+0x10/0x10
[ 813.095643] ret_from_fork_asm+0x1a/0x30
[ 813.095643] </TASK>
[ 813.095643] Modules linked in:
[ 813.095643] CR2: 0000000000000090
[ 813.095643] ---[ end trace 0000000000000000 ]---
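Reading the register dump above: RDI is 0 and CR2 is 0x90, and the faulting bytes `48 8b 87 90 00 00 00` decode to `mov 0x90(%rdi),%rax`, so this looks like a member read at offset 0x90 through a NULL first argument to blk_mq_all_tag_iter(). A minimal userspace sketch of that arithmetic, assuming a made-up struct layout (`fake_tags`, `pad`, `field_at_0x90` and `fault_address()` are hypothetical names for illustration, not the kernel's definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: only the 0x90 offset is taken from the oops. */
struct fake_tags {
	unsigned char pad[0x90];   /* stand-in for whatever precedes the field */
	void *field_at_0x90;       /* the member the faulting load would read */
};

/* Compile-time check that the stand-in field really sits at 0x90. */
static_assert(offsetof(struct fake_tags, field_at_0x90) == 0x90,
	      "field must be at offset 0x90");

/*
 * Address that a load of base->field_at_0x90 would touch, computed with
 * integer arithmetic so nothing is actually dereferenced.
 */
static uintptr_t fault_address(uintptr_t base)
{
	return base + offsetof(struct fake_tags, field_at_0x90);
}
```

With a base of 0 (the value in RDI), fault_address() gives 0x90, which matches CR2.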
It triggered after running mmiotrace, which shuts down and brings up CPUs.
Not sure it's related. I can check how reproducible this is, and if it is, I
can try to bisect it.
-- Steve