Message-ID: <CABXGCsMDOWXM8SQbmNsiXTh6ej87JKah3Wh_ze2dzG5mO5W98g@mail.gmail.com>
Date: Thu, 21 Aug 2025 11:56:00 +0500
From: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To: sunjunchao@...edance.com, axboe@...nel.dk, nilay@...ux.ibm.com, 
	yukuai3@...wei.com, Ming Lei <ming.lei@...hat.com>, linux-block@...r.kernel.org, 
	Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>
Subject: [REGRESSION] 6.17-rc2: lockdep circular dependency at boot introduced by 8f5845e0743b (“block: restore default wbt enablement”)

Hi,

After commit 8f5845e0743b (“block: restore default wbt enablement”)
I started seeing a lockdep warning about a circular locking dependency
on every boot.

Bisect:
git bisect identifies 8f5845e0743bf3512b71b3cb8afe06c192d6acc4 as the
first bad commit.
Reverting this commit on top of 6.17.0-rc2-git-b19a97d57c15 makes the
warning disappear completely.

The warning looks like this:
[   12.595070] nvme nvme0: 32/0/0 default/read/poll queues
[   12.595566] nvme nvme1: 32/0/0 default/read/poll queues

[   12.610697] ======================================================
[   12.610705] WARNING: possible circular locking dependency detected
[   12.610714] 6.17.0-rc2-git-b19a97d57c15+ #158 Not tainted
[   12.610726] ------------------------------------------------------
[   12.610734] kworker/u129:3/911 is trying to acquire lock:
[   12.610743] ffffffff899ab700 (cpu_hotplug_lock){++++}-{0:0}, at:
static_key_slow_inc+0x16/0x40
[   12.610760]
               but task is already holding lock:
[   12.610769] ffff8881d166d570
(&q->q_usage_counter(io)#4){++++}-{0:0}, at:
blk_mq_freeze_queue_nomemsave+0x16/0x30
[   12.610787]
               which lock already depends on the new lock.

[   12.610798]
               the existing dependency chain (in reverse order) is:
[   12.610971]
               -> #2 (&q->q_usage_counter(io)#4){++++}-{0:0}:
[   12.611246]        __lock_acquire+0x56a/0xbe0
[   12.611381]        lock_acquire.part.0+0xc7/0x270
[   12.611518]        blk_alloc_queue+0x5cd/0x720
[   12.611649]        blk_mq_alloc_queue+0x143/0x250
[   12.611780]        __blk_mq_alloc_disk+0x18/0xd0
[   12.611906]        nvme_alloc_ns+0x240/0x1930 [nvme_core]
[   12.612042]        nvme_scan_ns+0x320/0x3b0 [nvme_core]
[   12.612170]        async_run_entry_fn+0x94/0x540
[   12.612289]        process_one_work+0x87a/0x14e0
[   12.612406]        worker_thread+0x5f2/0xfd0
[   12.612527]        kthread+0x3b0/0x770
[   12.612641]        ret_from_fork+0x3ef/0x510
[   12.612760]        ret_from_fork_asm+0x1a/0x30
[   12.612875]
               -> #1 (fs_reclaim){+.+.}-{0:0}:
[   12.613102]        __lock_acquire+0x56a/0xbe0
[   12.613215]        lock_acquire.part.0+0xc7/0x270
[   12.613327]        fs_reclaim_acquire+0xd9/0x130
[   12.613444]        __kmalloc_cache_node_noprof+0x60/0x4e0
[   12.613560]        amd_pmu_cpu_prepare+0x123/0x670
[   12.613674]        cpuhp_invoke_callback+0x2c8/0x9c0
[   12.613791]        __cpuhp_invoke_callback_range+0xbd/0x1f0
[   12.613904]        _cpu_up+0x2f8/0x6c0
[   12.614015]        cpu_up+0x11e/0x1c0
[   12.614124]        cpuhp_bringup_mask+0xea/0x130
[   12.614231]        bringup_nonboot_cpus+0xa9/0x170
[   12.614335]        smp_init+0x2b/0xf0
[   12.614443]        kernel_init_freeable+0x23f/0x2e0
[   12.614545]        kernel_init+0x1c/0x150
[   12.614643]        ret_from_fork+0x3ef/0x510
[   12.614744]        ret_from_fork_asm+0x1a/0x30
[   12.614840]
               -> #0 (cpu_hotplug_lock){++++}-{0:0}:
[   12.615029]        check_prev_add+0xe1/0xcf0
[   12.615126]        validate_chain+0x4cf/0x740
[   12.615221]        __lock_acquire+0x56a/0xbe0
[   12.615316]        lock_acquire.part.0+0xc7/0x270
[   12.615414]        cpus_read_lock+0x40/0xe0
[   12.615508]        static_key_slow_inc+0x16/0x40
[   12.615602]        rq_qos_add+0x264/0x440
[   12.615696]        wbt_init+0x3b2/0x510
[   12.615793]        blk_register_queue+0x334/0x470
[   12.615887]        __add_disk+0x5fd/0xd50
[   12.615980]        add_disk_fwnode+0x113/0x590
[   12.616073]        nvme_alloc_ns+0x7be/0x1930 [nvme_core]
[   12.616173]        nvme_scan_ns+0x320/0x3b0 [nvme_core]
[   12.616272]        async_run_entry_fn+0x94/0x540
[   12.616366]        process_one_work+0x87a/0x14e0
[   12.616464]        worker_thread+0x5f2/0xfd0
[   12.616558]        kthread+0x3b0/0x770
[   12.616651]        ret_from_fork+0x3ef/0x510
[   12.616749]        ret_from_fork_asm+0x1a/0x30
[   12.616841]
               other info that might help us debug this:

[   12.617108] Chain exists of:
                 cpu_hotplug_lock --> fs_reclaim --> &q->q_usage_counter(io)#4

[   12.617385]  Possible unsafe locking scenario:

[   12.617570]        CPU0                    CPU1
[   12.617662]        ----                    ----
[   12.617755]   lock(&q->q_usage_counter(io)#4);
[   12.617847]                                lock(fs_reclaim);
[   12.617940]                                lock(&q->q_usage_counter(io)#4);
[   12.618035]   rlock(cpu_hotplug_lock);
[   12.618129]
                *** DEADLOCK ***

[   12.618397] 7 locks held by kworker/u129:3/911:
[   12.618495]  #0: ffff8881083ba158
((wq_completion)async){+.+.}-{0:0}, at: process_one_work+0xe31/0x14e0
[   12.618692]  #1: ffffc900061b7d20
((work_completion)(&entry->work)){+.+.}-{0:0}, at:
process_one_work+0x7f9/0x14e0
[   12.618906]  #2: ffff888109c801a8
(&set->update_nr_hwq_lock){.+.+}-{4:4}, at: add_disk_fwnode+0xfd/0x590
[   12.619132]  #3: ffff8881d166dbb8 (&q->sysfs_lock){+.+.}-{4:4}, at:
blk_register_queue+0xdc/0x470
[   12.619257]  #4: ffff8881d166d798 (&q->rq_qos_mutex){+.+.}-{4:4},
at: wbt_init+0x39c/0x510
[   12.619383]  #5: ffff8881d166d570
(&q->q_usage_counter(io)#4){++++}-{0:0}, at:
blk_mq_freeze_queue_nomemsave+0x16/0x30
[   12.619640]  #6: ffff8881d166d5b0
(&q->q_usage_counter(queue)#4){+.+.}-{0:0}, at:
blk_mq_freeze_queue_nomemsave+0x16/0x30
[   12.619913]
               stack backtrace:
[   12.620171] CPU: 6 UID: 0 PID: 911 Comm: kworker/u129:3 Not tainted
6.17.0-rc2-git-b19a97d57c15+ #158 PREEMPT(lazy)
[   12.620173] Hardware name: ASRock B650I Lightning WiFi/B650I
Lightning WiFi, BIOS 3.30 06/16/2025
[   12.620174] Workqueue: async async_run_entry_fn
[   12.620177] Call Trace:
[   12.620178]  <TASK>
[   12.620179]  dump_stack_lvl+0x84/0xd0
[   12.620182]  print_circular_bug.cold+0x38/0x46
[   12.620185]  check_noncircular+0x14a/0x170
[   12.620187]  check_prev_add+0xe1/0xcf0
[   12.620189]  ? lock_acquire.part.0+0xc7/0x270
[   12.620191]  validate_chain+0x4cf/0x740
[   12.620193]  __lock_acquire+0x56a/0xbe0
[   12.620196]  lock_acquire.part.0+0xc7/0x270
[   12.620197]  ? static_key_slow_inc+0x16/0x40
[   12.620199]  ? rcu_is_watching+0x15/0xe0
[   12.620202]  ? __pfx___might_resched+0x10/0x10
[   12.620204]  ? static_key_slow_inc+0x16/0x40
[   12.620205]  ? lock_acquire+0xf6/0x140
[   12.620207]  cpus_read_lock+0x40/0xe0
[   12.620209]  ? static_key_slow_inc+0x16/0x40
[   12.620210]  static_key_slow_inc+0x16/0x40
[   12.620212]  rq_qos_add+0x264/0x440
[   12.620213]  wbt_init+0x3b2/0x510
[   12.620215]  ? wbt_enable_default+0x174/0x2b0
[   12.620217]  blk_register_queue+0x334/0x470
[   12.620218]  __add_disk+0x5fd/0xd50
[   12.620220]  ? wait_for_completion+0x17f/0x3c0
[   12.620222]  add_disk_fwnode+0x113/0x590
[   12.620224]  nvme_alloc_ns+0x7be/0x1930 [nvme_core]
[   12.620232]  ? __pfx_nvme_alloc_ns+0x10/0x10 [nvme_core]
[   12.620241]  ? __pfx_nvme_find_get_ns+0x10/0x10 [nvme_core]
[   12.620249]  ? __pfx_nvme_ns_info_from_identify+0x10/0x10 [nvme_core]
[   12.620257]  nvme_scan_ns+0x320/0x3b0 [nvme_core]
[   12.620264]  ? __pfx_nvme_scan_ns+0x10/0x10 [nvme_core]
[   12.620271]  ? __lock_release.isra.0+0x1cb/0x340
[   12.620273]  ? lockdep_hardirqs_on+0x8c/0x130
[   12.620275]  ? seqcount_lockdep_reader_access+0xb5/0xc0
[   12.620277]  ? seqcount_lockdep_reader_access+0xb5/0xc0
[   12.620279]  ? ktime_get+0x6a/0x180
[   12.620281]  async_run_entry_fn+0x94/0x540
[   12.620282]  process_one_work+0x87a/0x14e0
[   12.620285]  ? __pfx_process_one_work+0x10/0x10
[   12.620287]  ? local_clock_noinstr+0xf/0x130
[   12.620289]  ? assign_work+0x156/0x390
[   12.620291]  worker_thread+0x5f2/0xfd0
[   12.620294]  ? __pfx_worker_thread+0x10/0x10
[   12.620295]  kthread+0x3b0/0x770
[   12.620297]  ? local_clock_noinstr+0xf/0x130
[   12.620298]  ? __pfx_kthread+0x10/0x10
[   12.620300]  ? rcu_is_watching+0x15/0xe0
[   12.620301]  ? __pfx_kthread+0x10/0x10
[   12.620303]  ret_from_fork+0x3ef/0x510
[   12.620305]  ? __pfx_kthread+0x10/0x10
[   12.620306]  ? __pfx_kthread+0x10/0x10
[   12.620307]  ret_from_fork_asm+0x1a/0x30
[   12.620310]  </TASK>
[   12.628224]  nvme0n1: p1
[   12.628699]  nvme1n1: p1 p2 p3

It looks like enabling WBT by default causes wbt_init() → rq_qos_add()
to hit static_key_slow_inc(), which takes cpus_read_lock() (i.e.
cpu_hotplug_lock), while the worker already holds q_usage_counter(io),
creating the cycle reported by lockdep.
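
To make the cycle easier to see, below is a minimal userspace pthread
model of the two acquisition orders from the trace. This is only an
illustration, assuming plain mutexes stand in for cpu_hotplug_lock,
fs_reclaim and q->q_usage_counter(io); it is not kernel code:

/*
 * Userspace model of the lock cycle reported above.
 *
 * thread_hotplug models the established chain
 *     cpu_hotplug_lock --> fs_reclaim --> q_usage_counter
 * thread_scan models the new edge added by default wbt enablement
 *     q_usage_counter --> cpu_hotplug_lock
 * Run together, the two orderings can (and usually will) deadlock,
 * which is the point of the illustration.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t cpu_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t fs_reclaim       = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t q_usage_counter  = PTHREAD_MUTEX_INITIALIZER;

static void *thread_hotplug(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&cpu_hotplug_lock);   /* _cpu_up() path              */
	pthread_mutex_lock(&fs_reclaim);         /* kmalloc in cpuhp callback   */
	usleep(1000);                            /* widen the race window       */
	pthread_mutex_lock(&q_usage_counter);    /* queue alloc/freeze          */
	pthread_mutex_unlock(&q_usage_counter);
	pthread_mutex_unlock(&fs_reclaim);
	pthread_mutex_unlock(&cpu_hotplug_lock);
	return NULL;
}

static void *thread_scan(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&q_usage_counter);    /* blk_mq_freeze_queue_nomemsave()       */
	usleep(1000);                            /* widen the race window                 */
	pthread_mutex_lock(&cpu_hotplug_lock);   /* static_key_slow_inc() in rq_qos_add() */
	pthread_mutex_unlock(&cpu_hotplug_lock);
	pthread_mutex_unlock(&q_usage_counter);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, thread_hotplug, NULL);
	pthread_create(&b, NULL, thread_scan, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	printf("no deadlock this run (orderings raced benignly)\n");
	return 0;
}

Compiled with -pthread it will usually just hang when the two orders
collide; building it with -fsanitize=thread should make ThreadSanitizer
report the lock-order inversion as a potential deadlock even on runs
that complete.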

Environment / Repro:
    Hardware: ASRock B650I Lightning WiFi (NVMe), link to probe below
    Kernel: 6.17.0-rc2-git-b19a97d57c15 (self-built)
    Repro: occurs deterministically on every boot during NVMe namespace scan
    First bad commit: 8f5845e0743bf3512b71b3cb8afe06c192d6acc4
(“block: restore default wbt enablement”) — found by git bisect
    Fix/workaround: revert 8f5845e0743b

Attachments:
    Full dmesg (with the complete lockdep trace)
    .config
    Hardware probe: https://linux-hardware.org/?probe=9a6dd1ef4d

Happy to test any proposed patches or additional instrumentation.

Thanks for looking into it.

-- 
Best Regards,
Mike Gavrilov.

