Message-ID: <202509212119.eab661a8-lkp@intel.com>
Date: Sun, 21 Sep 2025 21:29:34 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [peterz-queue:sched/cleanup] [sched] cfcabf4524:
WARNING:possible_recursive_locking_detected
Hello,
kernel test robot noticed "WARNING:possible_recursive_locking_detected" on:
commit: cfcabf45249df741fa733f41f7dbf98534e31b6b ("sched: Fix do_set_cpus_allowed() locking")
https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/cleanup
in testcase: locktorture
version:
with the following parameters:
runtime: 300s
test: cpuhotplug
config: x86_64-randconfig-076-20250917
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202509212119.eab661a8-lkp@intel.com
[ 95.960389][ T23]
[ 95.961013][ T23] ============================================
[ 95.961890][ T23] WARNING: possible recursive locking detected
[ 95.962817][ T23] 6.17.0-rc4-00016-gcfcabf45249d #1 Tainted: G T
[ 95.967338][ T23] --------------------------------------------
[ 95.976369][ T23] migration/1/23 is trying to acquire lock:
[ 95.977282][ T23] ffff8883a9dfa198 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.978743][ T23]
[ 95.978743][ T23] but task is already holding lock:
[ 95.979934][ T23] ffff8883a9dfa198 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.981409][ T23]
[ 95.981409][ T23] other info that might help us debug this:
[ 95.982674][ T23] Possible unsafe locking scenario:
[ 95.982674][ T23]
[ 95.984064][ T23] CPU0
[ 95.984613][ T23] ----
[ 95.985206][ T23] lock(&rq->__lock);
[ 95.985896][ T23] lock(&rq->__lock);
[ 95.986590][ T23]
[ 95.986590][ T23] *** DEADLOCK ***
[ 95.986590][ T23]
[ 95.988030][ T23] May be due to missing lock nesting notation
[ 95.988030][ T23]
[ 95.989277][ T23] 3 locks held by migration/1/23:
[ 95.990078][ T23] #0: ffff888175d905b8 (&p->pi_lock){-.-.}-{2:2}, at: __balance_push_cpu_stop (kernel/sched/sched.h:1520 kernel/sched/sched.h:1847 kernel/sched/core.c:8098)
[ 95.991598][ T23] #1: ffff8883a9dfa198 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.993052][ T23] #2: ffffffff8cd48ba0 (rcu_read_lock){....}-{1:3}, at: cpuset_cpus_allowed_fallback (include/linux/rcupdate.h:331 include/linux/rcupdate.h:841 kernel/cgroup/cpuset.c:4122)
[ 95.996014][ T23]
[ 95.996014][ T23] stack backtrace:
[ 95.996988][ T23] CPU: 1 UID: 0 PID: 23 Comm: migration/1 Tainted: G T 6.17.0-rc4-00016-gcfcabf45249d #1 PREEMPT
[ 95.996998][ T23] Tainted: [T]=RANDSTRUCT
[ 95.997001][ T23] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 95.997005][ T23] Stopper: __balance_push_cpu_stop+0x0/0x320 <- balance_push (kernel/sched/core.c:8177)
[ 95.997018][ T23] Call Trace:
[ 95.997022][ T23] <TASK>
[ 95.997027][ T23] __dump_stack (lib/dump_stack.c:95)
[ 95.997034][ T23] dump_stack_lvl (lib/dump_stack.c:123)
[ 95.997041][ T23] dump_stack (lib/dump_stack.c:130)
[ 95.997046][ T23] print_deadlock_bug (kernel/locking/lockdep.c:3043)
[ 95.997054][ T23] __lock_acquire (kernel/locking/lockdep.c:?)
[ 95.997062][ T23] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91)
[ 95.997070][ T23] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:271)
[ 95.997080][ T23] lock_acquire (kernel/locking/lockdep.c:5868)
[ 95.997085][ T23] ? raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.997091][ T23] ? __lock_acquire (kernel/locking/lockdep.c:?)
[ 95.997096][ T23] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91)
[ 95.997102][ T23] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:271)
[ 95.997109][ T23] ? raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.997114][ T23] _raw_spin_lock_nested (kernel/locking/spinlock.c:378)
[ 95.997121][ T23] ? raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.997127][ T23] raw_spin_rq_lock_nested (kernel/sched/core.c:638)
[ 95.997133][ T23] __task_rq_lock (include/linux/sched.h:2226)
[ 95.997141][ T23] do_set_cpus_allowed (kernel/sched/sched.h:1825 kernel/sched/core.c:2742)
[ 95.997149][ T23] ? cpuset_cpus_allowed_fallback (include/linux/rcupdate.h:331 include/linux/rcupdate.h:841 kernel/cgroup/cpuset.c:4122)
[ 95.997157][ T23] cpuset_cpus_allowed_fallback (kernel/cgroup/cpuset.c:?)
[ 95.997164][ T23] select_fallback_rq (kernel/sched/core.c:?)
[ 95.997171][ T23] __balance_push_cpu_stop (kernel/sched/core.c:8103)
[ 95.997178][ T23] ? __do_trace_sched_move_numa (kernel/sched/core.c:8091)
[ 95.997183][ T23] cpu_stopper_thread (kernel/stop_machine.c:513)
[ 95.997192][ T23] ? cpu_stop_should_run (kernel/stop_machine.c:488)
[ 95.997200][ T23] smpboot_thread_fn (kernel/smpboot.c:?)
[ 95.997210][ T23] ? smpboot_thread_fn (kernel/smpboot.c:?)
[ 95.997218][ T23] kthread (kernel/kthread.c:465)
[ 95.997225][ T23] ? smpboot_unregister_percpu_thread (kernel/smpboot.c:103)
[ 95.997233][ T23] ? __do_trace_sched_kthread_stop_ret (kernel/kthread.c:412)
[ 95.997240][ T23] ret_from_fork (arch/x86/kernel/process.c:154)
[ 95.997247][ T23] ? __do_trace_sched_kthread_stop_ret (kernel/kthread.c:412)
[ 95.997254][ T23] ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
[ 95.997263][ T23] </TASK>
[ 155.023652][ C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 57s!
[ 155.059744][ C0] Showing busy workqueues and worker pools:
[ 155.064892][ C0] workqueue events: flags=0x0
[ 155.065729][ C0] pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=3 refcnt=5
[ 155.065748][ C0] in-flight: 9:work_for_cpu_fn BAR(476) ,10:vmstat_shepherd
[ 155.065802][ C0] pending: e1000_watchdog
[ 155.065820][ C0] workqueue events_unbound: flags=0x2
[ 155.069920][ C0] pwq 10: cpus=0-1 node=0 flags=0x4 nice=0 active=1 refcnt=2
[ 155.069941][ C0] pending: crng_reseed
[ 155.069956][ C0] workqueue events_power_efficient: flags=0x82
[ 155.072924][ C0] pwq 9: cpus=0-1 node=0 flags=0x4 nice=0 active=4 refcnt=5
[ 155.072944][ C0] pending: do_cache_clean, 2*neigh_periodic_work, check_lifetime
[ 155.072972][ C0] pwq 10: cpus=0-1 node=0 flags=0x4 nice=0 active=2 refcnt=3
[ 155.072984][ C0] pending: 2*neigh_managed_work
[ 155.072996][ C0] workqueue mm_percpu_wq: flags=0x8
[ 155.078563][ C0] pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
[ 155.078581][ C0] pending: vmstat_update
[ 155.078679][ C0] workqueue ipv6_addrconf: flags=0x6000a
[ 155.081392][ C0] pwq 8: cpus=0-1 flags=0x4 nice=0 active=1 refcnt=4
[ 155.081409][ C0] pending: addrconf_verify_work
[ 155.081427][ C0] pool 2: cpus=0 node=0 flags=0x0 nice=0 hung=57s workers=4 idle: 223 94
[ 155.081475][ C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
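
For readability, the recursion reported above can be reduced to the following call
chain. This is a sketch assembled from the backtrace only; the annotations are
ours, not kernel source, and bodies are elided:

```c
/*
 * Call chain reconstructed from the lockdep report (sketch only):
 *
 * __balance_push_cpu_stop()          // holds p->pi_lock (#0) and rq->__lock (#1)
 *   select_fallback_rq()
 *     cpuset_cpus_allowed_fallback() // takes rcu_read_lock (#2)
 *       do_set_cpus_allowed()
 *         __task_rq_lock()           // tries to acquire rq->__lock again
 *                                    //  -> "possible recursive locking detected"
 */
```

The second acquisition of the same rq->__lock, while still holding it from
__balance_push_cpu_stop(), is what triggers the lockdep splat.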
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250921/202509212119.eab661a8-lkp@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki