Message-ID: <3fb4046061853ec8786657f0bde7c0b49b4f37e0.camel@kernel.org>
Date: Wed, 25 Jun 2025 08:07:53 -0400
From: Jeff Layton <jlayton@...nel.org>
To: kernel test robot <oliver.sang@...el.com>, Jakub Kicinski
<kuba@...nel.org>
Cc: oe-lkp@...ts.linux.dev, lkp@...el.com, linux-kernel@...r.kernel.org
Subject: Re: [linux-next:master] [ref_tracker] 65b584f536:
BUG:spinlock_trylock_failure_on_UP_on_CPU
On Wed, 2025-06-25 at 14:32 +0800, kernel test robot wrote:
>
> Hello,
>
> kernel test robot noticed "BUG:spinlock_trylock_failure_on_UP_on_CPU" on:
>
> commit: 65b584f5361163ba539d2c7122ca792c3cc87997 ("ref_tracker: automatically register a file in debugfs for a ref_tracker_dir")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> [test failed on linux-next/master f817b6dd2b62d921a6cdc0a3ac599cd1851f343c]
>
> in testcase: boot
>
> config: i386-randconfig-141-20250623
> compiler: gcc-12
> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
> +------------------------------------------------+------------+------------+
> |                                                | f6dbe294a1 | 65b584f536 |
> +------------------------------------------------+------------+------------+
> | BUG:spinlock_trylock_failure_on_UP_on_CPU      | 0          | 12         |
> | WARNING:at_kernel/workqueue.c:#__queue_work    | 0          | 12         |
> | EIP:__queue_work                               | 0          | 12         |
> +------------------------------------------------+------------+------------+
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@...el.com>
> | Closes: https://lore.kernel.org/oe-lkp/202506251406.c28f2adb-lkp@intel.com
>
>
> [ 51.542685][ T1] BUG: spinlock trylock failure on UP on CPU#0, swapper/1
> [ 51.543194][ T1] lock: debugfs_dentries+0x0/0x34, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> [ 51.543194][ T1] CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted 6.16.0-rc2-00006-g65b584f53611 #1 PREEMPTLAZY 672570e0a87e353b344c305ea64104c56bf67f95
> [ 51.543194][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 51.543194][ T1] Call Trace:
> [ 51.543194][ T1] dump_stack_lvl (arch/x86/include/asm/irqflags.h:26 (discriminator 3) arch/x86/include/asm/irqflags.h:109 (discriminator 3) arch/x86/include/asm/irqflags.h:151 (discriminator 3) lib/dump_stack.c:123 (discriminator 3))
> [ 51.543194][ T1] dump_stack (lib/dump_stack.c:130)
> [ 51.543194][ T1] spin_bug (kernel/locking/spinlock_debug.c:71 kernel/locking/spinlock_debug.c:78)
> [ 51.543194][ T1] do_raw_spin_trylock (kernel/locking/spinlock_debug.c:133)
> [ 51.543194][ T1] _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
> [ 51.543194][ T1] ? ref_tracker_dir_exit (lib/ref_tracker.c:54 lib/ref_tracker.c:226)
> [ 51.543194][ T1] ref_tracker_dir_exit (lib/ref_tracker.c:54 lib/ref_tracker.c:226)
> [ 51.543194][ T1] free_netdev (net/core/dev.c:11880)
> [ 51.543194][ T1] smc_init (drivers/net/ethernet/smsc/smc9194.c:729)
> [ 51.543194][ T1] net_olddevs_init (drivers/net/Space.c:191 drivers/net/Space.c:239 drivers/net/Space.c:248)
> [ 51.543194][ T1] ? ether_boot_setup (drivers/net/Space.c:244)
> [ 51.543194][ T1] do_one_initcall (init/main.c:1274)
> [ 51.543194][ T1] ? ether_boot_setup (drivers/net/Space.c:244)
> [ 51.543194][ T1] ? do_one_initcall (init/main.c:1291)
> [ 51.543194][ T1] do_initcalls (init/main.c:1335 init/main.c:1352)
> [ 51.543194][ T1] kernel_init_freeable (init/main.c:1588)
> [ 51.543194][ T1] ? rest_init (init/main.c:1466)
> [ 51.543194][ T1] kernel_init (init/main.c:1476)
> [ 51.543194][ T1] ret_from_fork (arch/x86/kernel/process.c:154)
> [ 51.543194][ T1] ? rest_init (init/main.c:1466)
> [ 51.543194][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> [ 51.543194][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:945)
> [ 51.578771][ T1] ------------[ cut here ]------------
> [ 51.579764][ T1] WARNING: CPU: 0 PID: 1 at kernel/workqueue.c:2325 __queue_work (kernel/workqueue.c:2325)
> [ 51.581319][ T1] Modules linked in:
> [ 51.582069][ T1] CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted 6.16.0-rc2-00006-g65b584f53611 #1 PREEMPTLAZY 672570e0a87e353b344c305ea64104c56bf67f95
> [ 51.584508][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 51.586299][ T1] EIP: __queue_work (kernel/workqueue.c:2325)
> [ 51.587177][ T1] Code: ff e8 ca 85 f7 ff e9 e5 fa ff ff 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00 90 e8 a5 85 f7 ff e9 1c fc ff ff 8d b6 00 00 00 00 <0f> 0b 6a 00 31 c9 ba 01 00 00 00 b8 70 b3 dd 8a e8 1b 52 11 00 58
> All code
> ========
> 0: ff (bad)
> 1: e8 ca 85 f7 ff call 0xfffffffffff785d0
> 6: e9 e5 fa ff ff jmp 0xfffffffffffffaf0
> b: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
> 12: 8d b4 26 00 00 00 00 lea 0x0(%rsi,%riz,1),%esi
> 19: 90 nop
> 1a: e8 a5 85 f7 ff call 0xfffffffffff785c4
> 1f: e9 1c fc ff ff jmp 0xfffffffffffffc40
> 24: 8d b6 00 00 00 00 lea 0x0(%rsi),%esi
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: 6a 00 push $0x0
> 2e: 31 c9 xor %ecx,%ecx
> 30: ba 01 00 00 00 mov $0x1,%edx
> 35: b8 70 b3 dd 8a mov $0x8addb370,%eax
> 3a: e8 1b 52 11 00 call 0x11525a
> 3f: 58 pop %rax
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: 6a 00 push $0x0
> 4: 31 c9 xor %ecx,%ecx
> 6: ba 01 00 00 00 mov $0x1,%edx
> b: b8 70 b3 dd 8a mov $0x8addb370,%eax
> 10: e8 1b 52 11 00 call 0x115230
> 15: 58 pop %rax
> [ 51.588586][ T1] EAX: 8addb388 EBX: 00000000 ECX: 00000000 EDX: 00000001
> [ 51.588586][ T1] ESI: 8b872e60 EDI: 8125cd00 EBP: 813bbe34 ESP: 813bbe10
> [ 51.588586][ T1] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010082
> [ 51.588586][ T1] CR0: 80050033 CR2: ffdd9000 CR3: 0b057000 CR4: 00040690
> [ 51.588586][ T1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [ 51.588586][ T1] DR6: fffe0ff0 DR7: 00000400
> [ 51.588586][ T1] Call Trace:
> [ 51.588586][ T1] queue_work_on (kernel/workqueue.c:2393)
> [ 51.588586][ T1] ref_tracker_dir_exit (lib/ref_tracker.c:227)
> [ 51.588586][ T1] free_netdev (net/core/dev.c:11880)
> [ 51.588586][ T1] smc_init (drivers/net/ethernet/smsc/smc9194.c:729)
> [ 51.588586][ T1] net_olddevs_init (drivers/net/Space.c:191 drivers/net/Space.c:239 drivers/net/Space.c:248)
> [ 51.588586][ T1] ? ether_boot_setup (drivers/net/Space.c:244)
> [ 51.588586][ T1] do_one_initcall (init/main.c:1274)
> [ 51.588586][ T1] ? ether_boot_setup (drivers/net/Space.c:244)
> [ 51.588586][ T1] ? do_one_initcall (init/main.c:1291)
> [ 51.588586][ T1] do_initcalls (init/main.c:1335 init/main.c:1352)
> [ 51.588586][ T1] kernel_init_freeable (init/main.c:1588)
> [ 51.588586][ T1] ? rest_init (init/main.c:1466)
> [ 51.588586][ T1] kernel_init (init/main.c:1476)
> [ 51.588586][ T1] ret_from_fork (arch/x86/kernel/process.c:154)
> [ 51.588586][ T1] ? rest_init (init/main.c:1466)
> [ 51.588586][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
> [ 51.588586][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:945)
> [ 51.588586][ T1] irq event stamp: 225562
> [ 51.588586][ T1] hardirqs last enabled at (225561): _raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
> [ 51.588586][ T1] hardirqs last disabled at (225562): _raw_spin_lock_irqsave (include/linux/spinlock_api_smp.h:108 kernel/locking/spinlock.c:162)
> [ 51.588586][ T1] softirqs last enabled at (225310): neigh_parms_alloc (include/linux/bitmap.h:236 include/net/neighbour.h:113 net/core/neighbour.c:1687)
> [ 51.588586][ T1] softirqs last disabled at (225308): neigh_parms_alloc (include/linux/list.h:169 net/core/neighbour.c:1684)
> [ 51.588586][ T1] ---[ end trace 0000000000000000 ]---
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250625/202506251406.c28f2adb-lkp@intel.com
>
>
I think I see the problem. The ref_tracker xarrays and the reap work
item are initialized in a late_initcall, but ref_tracker_dir_exit() can
run before that: the trace above comes from an earlier initcall
("subsys", I think). So those initializations need to happen earlier.

A patch like this should fix it. Is "postcore" the right stage for
this? It looks like netdevs get set up in "subsys", but I wasn't sure
about the i915 driver.

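To illustrate the ordering issue, here is a minimal userspace sketch, not
kernel code. It models only three initcall levels and assumes (as guessed
above) that the consumer runs at "subsys". All names in it (enum level,
tracker_init, driver_init, run_initcalls, count_early_uses) are
hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model: the real kernel has more levels. */
enum level { LVL_POSTCORE, LVL_SUBSYS, LVL_LATE, LVL_COUNT };

static bool tracker_ready; /* stands in for INIT_WORK()/xa_init_flags() having run */
static int early_uses;     /* uses that happened before initialization */

/* Stands in for the ref_tracker debugfs init function. */
static int tracker_init(void)
{
	tracker_ready = true;
	return 0;
}

/* Stands in for a driver init that tears down a netdev on failure. */
static int driver_init(void)
{
	if (!tracker_ready)
		early_uses++; /* lock and work item would be used uninitialized */
	return 0;
}

struct initcall {
	enum level lvl;
	int (*fn)(void);
};

/* Run all registered calls level by level, lowest level first. */
static int run_initcalls(const struct initcall *calls, size_t n)
{
	tracker_ready = false;
	early_uses = 0;
	for (int lvl = 0; lvl < LVL_COUNT; lvl++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].lvl == (enum level)lvl)
				calls[i].fn();
	return early_uses;
}

/* Count premature uses when the tracker init sits at the given level. */
int count_early_uses(enum level tracker_lvl)
{
	const struct initcall calls[] = {
		{ LVL_SUBSYS, driver_init },
		{ tracker_lvl, tracker_init },
	};
	return run_initcalls(calls, 2);
}
```

With the tracker init at LVL_LATE the driver sees it uninitialized once;
moving it to LVL_POSTCORE brings that count to zero, which is the effect
the patch below is after.
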
Jakub, would you like me to send a patch on top of the series, or
should I respin and resend the pile?
Thanks,
------------------------8<---------------------------
[PATCH] ref_tracker: do xarray and workqueue job initializations earlier
Signed-off-by: Jeff Layton <jlayton@...nel.org>
---
 lib/ref_tracker.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index dcf923a1edf5..a9e6ffcff04b 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -516,13 +516,19 @@ static void debugfs_reap_work(struct work_struct *work)
 	} while (reaped);
 }
 
-static int __init ref_tracker_debugfs_init(void)
+static int __init ref_tracker_debugfs_postcore_init(void)
 {
 	INIT_WORK(&debugfs_reap_worker, debugfs_reap_work);
 	xa_init_flags(&debugfs_dentries, XA_FLAGS_LOCK_IRQ);
 	xa_init_flags(&debugfs_symlinks, XA_FLAGS_LOCK_IRQ);
+	return 0;
+}
+postcore_initcall(ref_tracker_debugfs_postcore_init);
+
+static int __init ref_tracker_debugfs_late_init(void)
+{
 	ref_tracker_debug_dir = debugfs_create_dir("ref_tracker", NULL);
 	return 0;
 }
-late_initcall(ref_tracker_debugfs_init);
+late_initcall(ref_tracker_debugfs_late_init);
 #endif /* CONFIG_DEBUG_FS */
--
2.49.0