Message-ID: <202512161547.cd3a9187-lkp@intel.com>
Date: Tue, 16 Dec 2025 15:43:59 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Pingfan Liu <piliu@...hat.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Tejun Heo <tj@...nel.org>, Waiman Long <longman@...hat.com>, Chen Ridong
	<chenridong@...weicloud.com>, Peter Zijlstra <peterz@...radead.org>, "Juri
 Lelli" <juri.lelli@...hat.com>, Pierre Gondois <pierre.gondois@....com>,
	"Ingo Molnar" <mingo@...hat.com>, Vincent Guittot
	<vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, Mel
 Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
	<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [sched/deadline]  318e18ed22:
 BUG:soft_lockup-CPU##stuck_for#s![swapper:#]



Hello,

kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![swapper:#]" on:

commit: 318e18ed22e89397635e15095c014accaf47ed30 ("sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      d358e5254674b70f34c847715ca509e46eb81e6f]
[test failed on linux-next/master 5ce74bc1b7cb2732b22f9c93082545bc655d6547]

in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with the following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5


config: i386-randconfig-r071-20250410
compiler: gcc-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G

(please refer to attached dmesg/kmsg for entire log/backtrace)


We do not have enough knowledge to analyze the relation between the change
and the issue, so we ran the tests up to 1000 times. The issue reproduced in
65 out of 1000 runs, while the parent commit stayed clean in every run.
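
Because the failure is intermittent, a handful of clean runs says little about
whether a candidate fix works. The sketch below estimates how many runs are
needed to see at least one failure with a given probability. It assumes the
runs are independent and that the per-run failure rate is constant; neither
assumption is stated in this report, and the function name runs_needed is
hypothetical.

```python
import math


def runs_needed(p_fail: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one failure in n runs) >= confidence.

    Assumes independent runs with a constant per-run failure
    probability p_fail (an assumption, not something the report states).
    """
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no failure in n runs) = (1 - p_fail)^n <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


if __name__ == "__main__":
    rate = 65 / 1000  # reproduction rate quoted in the report text
    for conf in (0.95, 0.99):
        print(f"{conf:.0%} confidence: {runs_needed(rate, conf)} runs")
```

A fix that survives this many runs without a lockup is only weak evidence; a
larger sample, such as the 1000 runs used here, gives a tighter bound.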

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/group/nr_groups:
  vm-snb/trinity/openwrt-i386-generic-20190428.cgz/i386-randconfig-r071-20250410/gcc-14/300s/group-03/5


1f382215119a0bc1 318e18ed22e89397635e15095c0
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :1000         8%          82:1000  dmesg.BUG:kernel_hang_in_boot_stage
           :1000         7%          69:1000  dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]   <----
           :1000         8%          82:1000  dmesg.BUG:workqueue_lockup-pool
           :1000         7%          69:1000  dmesg.EIP:tick_clock_notify
           :1000         2%          15:1000  dmesg.INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
           :1000         5%          53:1000  dmesg.INFO:task_blocked_for_more_than#seconds
           :1000         7%          69:1000  dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202512161547.cd3a9187-lkp@intel.com


[  699.774873][    C0] watchdog: BUG: soft lockup - CPU#0 stuck for 626s! [swapper/0:1]
[  699.775553][    C0] CPU#0 Utilization every 96000ms during lockup:
[  699.775553][    C0] 	#1:  26% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#2:  25% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#3:  25% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#4:  34% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] 	#5: 100% system,	  0% softirq,	  0% hardirq,	  0% idle
[  699.775553][    C0] Modules linked in:
[  699.775553][    C0] irq event stamp: 201566
[  699.775553][    C0] hardirqs last  enabled at (201565): timekeeping_notify (arch/x86/include/asm/irqflags.h:42 arch/x86/include/asm/irqflags.h:119 arch/x86/include/asm/irqflags.h:159 include/linux/stop_machine.h:172 include/linux/stop_machine.h:179 kernel/time/timekeeping.c:1634)
[  699.775553][    C0] hardirqs last disabled at (201566): sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052)
[  699.775553][    C0] softirqs last  enabled at (200324): handle_softirqs (kernel/softirq.c:469 (discriminator 2) kernel/softirq.c:650 (discriminator 2))
[  699.775553][    C0] softirqs last disabled at (200309): __do_softirq (kernel/softirq.c:657)
[  699.775553][    C0] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.0-rc2-00020-g318e18ed22e8 #1 PREEMPT(full)
[  699.775553][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  699.775553][    C0] EIP: tick_clock_notify (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 kernel/time/tick-sched.c:1633)
[  699.775553][    C0] Code: 8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 89 15 3c d5 6a 83 83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 80 0d 44 d5 6a 83
All code
========
   0:	8b 45 e4             	mov    -0x1c(%rbp),%eax
   3:	89 1d 24 d5 6a 83    	mov    %ebx,-0x7c952adc(%rip)        # 0xffffffff836ad52d
   9:	a3 38 d5 6a 83 89 15 	movabs %eax,0xd53c1589836ad538
  10:	3c d5 
  12:	6a 83                	push   $0xffffffffffffff83
  14:	83 c4 10             	add    $0x10,%esp
  17:	5b                   	pop    %rbx
  18:	5e                   	pop    %rsi
  19:	5f                   	pop    %rdi
  1a:	5d                   	pop    %rbp
  1b:	c3                   	ret
  1c:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  23:	00 
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad575		<-- trapping instruction
  31:	c3                   	ret
  32:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  39:	00 
  3a:	80                   	.byte 0x80
  3b:	0d 44 d5 6a 83       	or     $0x836ad544,%eax

Code starting with the faulting instruction
===========================================
   0:	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad54b
   7:	c3                   	ret
   8:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
   f:	00 
  10:	80                   	.byte 0x80
  11:	0d 44 d5 6a 83       	or     $0x836ad544,%eax
[  699.775553][    C0] EAX: 0003135d EBX: 8322ef00 ECX: 00000006 EDX: 82f6bcac
[  699.775553][    C0] ESI: 00000200 EDI: 836ac3e0 EBP: 84c97ed8 ESP: 84c97ebc
[  699.775553][    C0] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00000202
[  699.775553][    C0] CR0: 80050033 CR2: ffdaa000 CR3: 03aeb000 CR4: 000406d0
[  699.775553][    C0] Call Trace:
[  699.775553][    C0]  ? timekeeping_notify (kernel/time/timekeeping.c:1636)
[  699.775553][    C0]  __clocksource_select (kernel/time/clocksource.c:1069 (discriminator 1))
[  699.775553][    C0]  ? boot_override_clock (kernel/time/clocksource.c:1101)
[  699.775553][    C0]  clocksource_select (kernel/time/clocksource.c:1086)
[  699.775553][    C0]  clocksource_done_booting (kernel/time/clocksource.c:1110)
[  699.775553][    C0]  do_one_initcall (init/main.c:1283)
[  699.775553][    C0]  ? rdinit_setup (init/main.c:1331)
[  699.775553][    C0]  do_initcalls (init/main.c:1344 (discriminator 3) init/main.c:1361 (discriminator 3))
[  699.775553][    C0]  kernel_init_freeable (init/main.c:1597)
[  699.775553][    C0]  ? rest_init (init/main.c:1475)
[  699.775553][    C0]  kernel_init (init/main.c:1485)
[  699.775553][    C0]  ret_from_fork (arch/x86/kernel/process.c:164)
[  699.775553][    C0]  ? rest_init (init/main.c:1475)
[  699.775553][    C0]  ret_from_fork_asm (arch/x86/entry/entry_32.S:737)
[  699.775553][    C0]  entry_INT80_32 (arch/x86/entry/entry_32.S:945)
[  699.775553][    C0] Kernel panic - not syncing: softlockup: hung tasks
[  699.775553][    C0] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Tainted: G             L      6.18.0-rc2-00020-g318e18ed22e8 #1 PREEMPT(full)
[  699.775553][    C0] Tainted: [L]=SOFTLOCKUP
[  699.775553][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  699.775553][    C0] Call Trace:
[  699.775553][    C0]  dump_stack_lvl (lib/dump_stack.c:122)
[  699.775553][    C0]  dump_stack (lib/dump_stack.c:130)
[  699.775553][    C0]  vpanic (kernel/panic.c:487)
[  699.775553][    C0]  panic (kernel/panic.c:626)
[  699.775553][    C0]  watchdog_timer_fn (kernel/watchdog.c:753)
[  699.775553][    C0]  __hrtimer_run_queues+0x125/0x1e0
[  699.775553][    C0]  ? schedule_work (drivers/usb/core/hub.c:925)
[  699.775553][    C0]  hrtimer_run_queues (kernel/time/hrtimer.c:1999)
[  699.775553][    C0]  update_process_times (kernel/time/timer.c:2416 kernel/time/timer.c:2472)
[  699.775553][    C0]  tick_periodic (kernel/time/tick-common.c:103)
[  699.775553][    C0]  tick_handle_periodic (kernel/time/tick-common.c:144)
[  699.775553][    C0]  ? vmware_sched_clock (arch/x86/kernel/apic/apic.c:1052)
[  699.775553][    C0]  __sysvec_apic_timer_interrupt (arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/include/asm/trace/irq_vectors.h:40 (discriminator 4) arch/x86/kernel/apic/apic.c:1059 (discriminator 4))
[  699.775553][    C0]  sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1052 (discriminator 2) arch/x86/kernel/apic/apic.c:1052 (discriminator 2))
[  699.775553][    C0]  handle_exception (arch/x86/entry/entry_32.S:1055)
[  699.775553][    C0] EIP: tick_clock_notify (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 kernel/time/tick-sched.c:1633)
[  699.775553][    C0] Code: 8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 89 15 3c d5 6a 83 83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 80 0d 44 d5 6a 83
All code
========
   0:	8b 45 e4             	mov    -0x1c(%rbp),%eax
   3:	89 1d 24 d5 6a 83    	mov    %ebx,-0x7c952adc(%rip)        # 0xffffffff836ad52d
   9:	a3 38 d5 6a 83 89 15 	movabs %eax,0xd53c1589836ad538
  10:	3c d5 
  12:	6a 83                	push   $0xffffffffffffff83
  14:	83 c4 10             	add    $0x10,%esp
  17:	5b                   	pop    %rbx
  18:	5e                   	pop    %rsi
  19:	5f                   	pop    %rdi
  1a:	5d                   	pop    %rbp
  1b:	c3                   	ret
  1c:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  23:	00 
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad575		<-- trapping instruction
  31:	c3                   	ret
  32:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
  39:	00 
  3a:	80                   	.byte 0x80
  3b:	0d 44 d5 6a 83       	or     $0x836ad544,%eax

Code starting with the faulting instruction
===========================================
   0:	80 0d 44 d5 6a 83 01 	orb    $0x1,-0x7c952abc(%rip)        # 0xffffffff836ad54b
   7:	c3                   	ret
   8:	2e 8d b4 26 00 00 00 	cs lea 0x0(%rsi,%riz,1),%esi
   f:	00 
  10:	80                   	.byte 0x80
  11:	0d 44 d5 6a 83       	or     $0x836ad544,%eax
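
The "Code:" line in the log marks the byte at EIP with angle brackets, and the
"2a:*" offset in the listing can be recovered from it directly. The sketch
below does that and also reads the 4-byte operand that follows the opcode.
Note that the listing appears to have been decoded in 64-bit mode (it shows
%rip-relative operands and two different computed addresses), while this is an
i386 kernel; in 32-bit mode the same bytes would address an absolute location.
That reading is an assumption and not something the report states, and the
function name trapping_offset is hypothetical.

```python
def trapping_offset(code_line: str) -> int:
    """Return the offset of the <xx>-marked byte in an oops 'Code:' line."""
    # Drop the timestamp/CPU prefix and the 'Code:' label if present.
    tokens = code_line.split("Code:", 1)[-1].split()
    for offset, token in enumerate(tokens):
        if token.startswith("<") and token.endswith(">"):
            return offset
    raise ValueError("no marked byte found in Code: line")


CODE = ("[  699.775553][    C0] Code: 8b 45 e4 89 1d 24 d5 6a 83 a3 38 d5 6a 83 "
        "89 15 3c d5 6a 83 83 c4 10 5b 5e 5f 5d c3 2e 8d b4 26 00 00 00 00 "
        "8d b6 00 00 00 00 <80> 0d 44 d5 6a 83 01 c3 2e 8d b4 26 00 00 00 00 "
        "80 0d 44 d5 6a 83")

if __name__ == "__main__":
    off = trapping_offset(CODE)
    print(f"trapping byte at offset {off:#x}")

    # Bytes after '80 0d' read as a little-endian 32-bit displacement.
    # Treating it as an absolute address assumes 32-bit decoding.
    disp = int.from_bytes(bytes.fromhex("44d56a83"), "little")
    print(f"operand displacement {disp:#010x}")
```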


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20251216/202512161547.cd3a9187-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

