lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 17 Nov 2023 16:53:57 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        John Ogness <john.ogness@...utronix.de>,
        <linux-kernel@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [rt-devel:linux-6.6.y-rt-rebase] [panic]  768c33033e:
 BUG:workqueue_lockup-pool



Hello,

kernel test robot noticed "BUG:workqueue_lockup-pool" on:

commit: 768c33033e41ad195a9304ebb183afb730d5ae98 ("panic: Mark emergency section in warn")
https://git.kernel.org/cgit/linux/kernel/git/rt/linux-rt-devel.git linux-6.6.y-rt-rebase

in testcase: boot

compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+----------------------------------------------------------------------------------+------------+------------+
|                                                                                  | fd2ace830c | 768c33033e |
+----------------------------------------------------------------------------------+------------+------------+
| BUG:workqueue_lockup-pool                                                        | 0          | 12         |
| INFO:rcu_sched_self-detected_stall_on_CPU                                        | 0          | 12         |
| EIP:desc_read                                                                    | 0          | 12         |
+----------------------------------------------------------------------------------+------------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202311171611.78d41dbe-oliver.sang@intel.com


[   64.476562][    C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 39s!
[   64.495685][    C0] Showing busy workqueues and worker pools:
[   64.496091][    C0] workqueue events: flags=0x0
[   64.496427][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   64.496432][    C0]     pending: deferred_probe_timeout_work_func
[   64.496442][    C0] workqueue events_power_efficient: flags=0x82
[   64.497853][    C0]   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256 refcnt=2
[   64.497856][    C0]     pending: do_cache_clean
[   64.497862][    C0] workqueue mm_percpu_wq: flags=0x8
[   64.499031][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   64.499034][    C0]     pending: vmstat_update
[   64.499044][    C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
[   95.196556][    C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 70s!
[   95.197232][    C0] Showing busy workqueues and worker pools:
[   95.197640][    C0] workqueue events: flags=0x0
[   95.197963][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   95.197969][    C0]     pending: deferred_probe_timeout_work_func
[   95.197976][    C0] workqueue events_power_efficient: flags=0x82
[   95.199381][    C0]   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256 refcnt=2
[   95.199384][    C0]     pending: do_cache_clean
[   95.199390][    C0] workqueue mm_percpu_wq: flags=0x8
[   95.200580][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   95.200583][    C0]     pending: vmstat_update
[   95.200593][    C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
[  125.916557][    C0] BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=0 stuck for 101s!
[  125.917235][    C0] Showing busy workqueues and worker pools:
[  125.917650][    C0] workqueue events: flags=0x0
[  125.917979][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  125.917984][    C0]     pending: deferred_probe_timeout_work_func
[  125.917992][    C0] workqueue events_power_efficient: flags=0x82
[  125.919409][    C0]   pwq 4: cpus=0-1 flags=0x4 nice=0 active=1/256 refcnt=2
[  125.919412][    C0]     pending: do_cache_clean
[  125.919419][    C0] workqueue mm_percpu_wq: flags=0x8
[  125.920619][    C0]   pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[  125.920622][    C0]     pending: vmstat_update
[  125.920632][    C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
[  129.568547][    C1] rcu: INFO: rcu_sched self-detected stall on CPU
[  129.568553][    C1] rcu: 	1-....: (26250 ticks this GP) idle=3c24/1/0x40000000 softirq=5858/5934 fqs=13122
[  129.568558][    C1] rcu: 	(t=26251 jiffies g=301 q=598 ncpus=2)
[  129.568562][    C1] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W        N 6.6.0-00134-g768c33033e41 #1
[ 129.568565][ C1] EIP: desc_read (kernel/printk/printk_ringbuffer.c:439) 
[ 129.568573][ C1] Code: fb 83 c4 08 5b 5e 5f 5d c3 8d 74 26 00 0f 0b eb d2 3e 8d 74 26 00 55 89 e5 57 89 cf 56 53 89 c3 83 ec 08 8b 08 b8 ff ff ff ff <89> 55 f0 d3 e0 f7 d0 21 d0 6b f0 58 83 f9 1f 0f 87 a7 e6 83 00 03
All code
========
   0:	fb                   	sti
   1:	83 c4 08             	add    $0x8,%esp
   4:	5b                   	pop    %rbx
   5:	5e                   	pop    %rsi
   6:	5f                   	pop    %rdi
   7:	5d                   	pop    %rbp
   8:	c3                   	ret
   9:	8d 74 26 00          	lea    0x0(%rsi,%riz,1),%esi
   d:	0f 0b                	ud2
   f:	eb d2                	jmp    0xffffffffffffffe3
  11:	3e 8d 74 26 00       	ds lea 0x0(%rsi,%riz,1),%esi
  16:	55                   	push   %rbp
  17:	89 e5                	mov    %esp,%ebp
  19:	57                   	push   %rdi
  1a:	89 cf                	mov    %ecx,%edi
  1c:	56                   	push   %rsi
  1d:	53                   	push   %rbx
  1e:	89 c3                	mov    %eax,%ebx
  20:	83 ec 08             	sub    $0x8,%esp
  23:	8b 08                	mov    (%rax),%ecx
  25:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
  2a:*	89 55 f0             	mov    %edx,-0x10(%rbp)		<-- trapping instruction
  2d:	d3 e0                	shl    %cl,%eax
  2f:	f7 d0                	not    %eax
  31:	21 d0                	and    %edx,%eax
  33:	6b f0 58             	imul   $0x58,%eax,%esi
  36:	83 f9 1f             	cmp    $0x1f,%ecx
  39:	0f 87 a7 e6 83 00    	ja     0x83e6e6
  3f:	03                   	.byte 0x3

Code starting with the faulting instruction
===========================================
   0:	89 55 f0             	mov    %edx,-0x10(%rbp)
   3:	d3 e0                	shl    %cl,%eax
   5:	f7 d0                	not    %eax
   7:	21 d0                	and    %edx,%eax
   9:	6b f0 58             	imul   $0x58,%eax,%esi
   c:	83 f9 1f             	cmp    $0x1f,%ecx
   f:	0f 87 a7 e6 83 00    	ja     0x83e6bc
  15:	03                   	.byte 0x3
[  129.568575][    C1] EAX: ffffffff EBX: c1d5cd60 ECX: 0000000f EDX: 3fff8001
[  129.568576][    C1] ESI: 00000001 EDI: c320fd78 EBP: c320fd28 ESP: c320fd14
[  129.568578][    C1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00000286
[  129.568582][    C1] CR0: 80050033 CR2: b7d77b93 CR3: 022b4000 CR4: 00040690
[  129.568583][    C1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  129.568584][    C1] DR6: fffe0ff0 DR7: 00000400
[  129.568585][    C1] Call Trace:
[ 129.568589][ C1] ? show_regs (arch/x86/kernel/dumpstack.c:479) 
[ 129.568596][ C1] ? dump_cpu_task (kernel/sched/core.c:11572) 
[ 129.568602][ C1] ? rcu_dump_cpu_stacks (include/linux/cpumask.h:143 include/linux/cpumask.h:150 include/linux/cpumask.h:212 kernel/rcu/tree_stall.h:367) 
[ 129.568606][ C1] ? vprintk (kernel/printk/printk_safe.c:46) 
[ 129.568611][ C1] ? print_cpu_stall (kernel/rcu/tree_stall.h:692) 
[ 129.568616][ C1] ? check_cpu_stall (kernel/rcu/tree_stall.h:775) 
[ 129.568620][ C1] ? rcu_sched_clock_irq (kernel/rcu/tree.c:3869 kernel/rcu/tree.c:2240) 
[ 129.568623][ C1] ? update_process_times (arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2080) 
[ 129.568628][ C1] ? tick_sched_timer (kernel/time/tick-sched.c:1497) 
[ 129.568632][ C1] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1688 kernel/time/hrtimer.c:1752) 
[ 129.568636][ C1] ? cpumask_weight (arch/x86/kernel/alternative.c:1766) 
[ 129.568640][ C1] ? hrtimer_interrupt (kernel/time/hrtimer.c:1817) 
[ 129.568645][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) 
[ 129.568650][ C1] ? __sysvec_apic_timer_interrupt (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:444 include/linux/jump_label.h:260 include/linux/jump_label.h:270 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1081) 
[ 129.568653][ C1] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074) 
[ 129.568656][ C1] ? handle_exception (arch/x86/entry/entry_32.S:1049) 
[ 129.568662][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) 
[ 129.568664][ C1] ? desc_read (kernel/printk/printk_ringbuffer.c:439) 
[ 129.568668][ C1] ? sysvec_call_function_single (arch/x86/kernel/apic/apic.c:1074) 
[ 129.568670][ C1] ? desc_read (kernel/printk/printk_ringbuffer.c:439) 
[ 129.568675][ C1] desc_read_finalized_seq (kernel/printk/printk_ringbuffer.c:1907) 
[ 129.568680][ C1] prb_next_reserve_seq (kernel/printk/printk_ringbuffer.c:2084) 
[ 129.568682][ C1] ? nbcon_get_cpu_emergency_nesting (kernel/printk/nbcon.c:984) 
[ 129.568689][ C1] nbcon_atomic_flush_all (kernel/printk/nbcon.c:1133) 
[ 129.568692][ C1] ? nbcon_atomic_flush_all (kernel/printk/nbcon.c:1133) 
[ 129.568695][ C1] printk_trigger_flush (arch/x86/include/asm/preempt.h:85 kernel/printk/printk.c:3993) 
[ 129.568698][ C1] nbcon_cpu_emergency_exit (kernel/printk/nbcon.c:1193) 
[ 129.568701][ C1] ? refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) 
[ 129.568706][ C1] __warn (kernel/panic.c:692) 
[ 129.568711][ C1] ? refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) 
[ 129.568714][ C1] report_bug (lib/bug.c:201 lib/bug.c:219) 
[ 129.568720][ C1] ? exc_overflow (arch/x86/kernel/traps.c:250) 
[ 129.568722][ C1] handle_bug (arch/x86/kernel/traps.c:216) 
[ 129.568724][ C1] exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) 
[ 129.568727][ C1] handle_exception (arch/x86/entry/entry_32.S:1049) 
[ 129.568729][ C1] EIP: refcount_warn_saturate (lib/refcount.c:28 (discriminator 3)) 
[ 129.568731][ C1] Code: 79 18 c2 01 68 a8 b9 c3 c1 e8 03 7b cf ff 0f 0b 58 c9 c3 8d b6 00 00 00 00 c6 05 86 79 18 c2 01 68 50 b9 c3 c1 e8 e7 7a cf ff <0f> 0b 59 c9 c3 66 90 89 c2 8b 00 3d 00 00 00 c0 74 12 83 f8 01 74
All code
========
   0:	79 18                	jns    0x1a
   2:	c2 01 68             	ret    $0x6801
   5:	a8 b9                	test   $0xb9,%al
   7:	c3                   	ret
   8:	c1 e8 03             	shr    $0x3,%eax
   b:	7b cf                	jnp    0xffffffffffffffdc
   d:	ff 0f                	decl   (%rdi)
   f:	0b 58 c9             	or     -0x37(%rax),%ebx
  12:	c3                   	ret
  13:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  19:	c6 05 86 79 18 c2 01 	movb   $0x1,-0x3de7867a(%rip)        # 0xffffffffc21879a6
  20:	68 50 b9 c3 c1       	push   $0xffffffffc1c3b950
  25:	e8 e7 7a cf ff       	call   0xffffffffffcf7b11
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	59                   	pop    %rcx
  2d:	c9                   	leave
  2e:	c3                   	ret
  2f:	66 90                	xchg   %ax,%ax
  31:	89 c2                	mov    %eax,%edx
  33:	8b 00                	mov    (%rax),%eax
  35:	3d 00 00 00 c0       	cmp    $0xc0000000,%eax
  3a:	74 12                	je     0x4e
  3c:	83 f8 01             	cmp    $0x1,%eax
  3f:	74                   	.byte 0x74

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	59                   	pop    %rcx
   3:	c9                   	leave
   4:	c3                   	ret
   5:	66 90                	xchg   %ax,%ax
   7:	89 c2                	mov    %eax,%edx
   9:	8b 00                	mov    (%rax),%eax
   b:	3d 00 00 00 c0       	cmp    $0xc0000000,%eax
  10:	74 12                	je     0x24
  12:	83 f8 01             	cmp    $0x1,%eax
  15:	74                   	.byte 0x74


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231117/202311171611.78d41dbe-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ