lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20210301083046.GG12822@xsang-OptiPlex-9020>
Date:   Mon, 1 Mar 2021 16:30:46 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     Song Liu <songliubraving@...com>,
        John Stultz <john.stultz@...aro.org>,
        Douglas Anderson <dianders@...omium.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com
Subject: [task_work]  35d0b389f3: BUG:workqueue_lockup-pool


Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 35d0b389f3b23439ad15b610d6e43fc72fc75779 ("task_work: unconditionally run task_work from get_signal()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with following parameters:

	runtime: 300s

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+---------------------------------------------------------+-----------+------------+
|                                                         | v5.11-rc2 | 35d0b389f3 |
+---------------------------------------------------------+-----------+------------+
| BUG:workqueue_lockup-pool                               | 0         | 7          |
| INFO:rcu_preempt_detected_stalls_on_CPUs/tasks          | 0         | 2          |
| BUG:scheduling_while_atomic                             | 0         | 1          |
| BUG:spinlock_recursion_on_CPU                           | 0         | 2          |
| EIP:preempt_schedule_common                             | 0         | 1          |
| EIP:_raw_spin_unlock_irq                                | 0         | 1          |
| EIP:preempt_count_sub                                   | 0         | 1          |
+---------------------------------------------------------+-----------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


[   60.721590] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 45s!
[   60.722087] Showing busy workqueues and worker pools:
[   60.722384] workqueue events: flags=0x0
[   60.722612]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256 refcnt=4
[   60.723012]     pending: vmstat_shepherd, cache_reap, regulator_init_complete_work_function
[   60.723511] workqueue rcu_gp: flags=0x8
[   60.723740]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   60.724147]     pending: strict_work_handler
[   60.724405] workqueue mm_percpu_wq: flags=0x8
[   60.724671]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   60.725071]     pending: vmstat_update
[   90.801593] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 75s!
[   90.802096] Showing busy workqueues and worker pools:
[   90.802394] workqueue events: flags=0x0
[   90.802623]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=3/256 refcnt=4
[   90.803030]     pending: vmstat_shepherd, cache_reap, regulator_init_complete_work_function
[   90.803528] workqueue rcu_gp: flags=0x8
[   90.803757]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   90.804157]     pending: strict_work_handler
[   90.804415] workqueue mm_percpu_wq: flags=0x8
[   90.804675]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[   90.805088]     pending: vmstat_update
[  115.721589] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  115.721985] 	(detected by 1, t=10002 jiffies, g=6965, q=871)
[  115.722324] rcu: All QSes seen, last rcu_preempt kthread activity 10002 (-18436--28438), jiffies_till_next_fqs=1, root ->qsmask 0x0
[  115.723004] rcu: rcu_preempt kthread starved for 10002 jiffies! g6965 f0x0 RCU_GP_ONOFF(3) ->state=0x0 ->cpu=0
[  115.723581] rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  115.724106] rcu: RCU grace-period kthread stack dump:
[  115.724402] task:rcu_preempt     state:R  running task     stack:    0 pid:   10 ppid:     2 flags:0x00004008
[  115.724975] Call Trace:
[  115.725127] __schedule (kbuild/src/consumer/kernel/sched/core.c:4316 kbuild/src/consumer/kernel/sched/core.c:5064) 
[  115.725340] ? __queue_work (kbuild/src/consumer/kernel/workqueue.c:1470) 
[  115.725570] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.725864] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:49 (discriminator 1) kbuild/src/consumer/arch/x86/include/asm/irqflags.h:89 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5327 (discriminator 1)) 
[  115.726123] irqentry_exit_cond_resched (??:?) 
[  115.726437] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:435) 
[  115.726651] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.726934] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172) 
[  115.727174] EIP: preempt_schedule_common (kbuild/src/consumer/arch/x86/include/asm/bitops.h:207 kbuild/src/consumer/include/asm-generic/bitops/instrumented-non-atomic.h:135 kbuild/src/consumer/include/linux/thread_info.h:102 kbuild/src/consumer/include/linux/sched.h:1917 kbuild/src/consumer/kernel/sched/core.c:5232) 
[ 115.727457] Code: f5 cb 89 de 64 ff 05 ac 77 f5 cb 64 a1 ac 77 f5 cb 25 ff ff ff 7f 48 74 2b b8 01 00 00 00 e8 c9 f4 ff ff 64 ff 0d ac 77 f5 cb <8b> 06 a8 08 75 d4 8b 5d f4 8b 75 f8 8b 7d fc 89 ec 5d c3 8d b4 26
All code
========
   0:	f5                   	cmc    
   1:	cb                   	lret   
   2:	89 de                	mov    %ebx,%esi
   4:	64 ff 05 ac 77 f5 cb 	incl   %fs:-0x340a8854(%rip)        # 0xffffffffcbf577b7
   b:	64 a1 ac 77 f5 cb 25 	movabs %fs:0xffffff25cbf577ac,%eax
  12:	ff ff ff 
  15:	7f 48                	jg     0x5f
  17:	74 2b                	je     0x44
  19:	b8 01 00 00 00       	mov    $0x1,%eax
  1e:	e8 c9 f4 ff ff       	callq  0xfffffffffffff4ec
  23:	64 ff 0d ac 77 f5 cb 	decl   %fs:-0x340a8854(%rip)        # 0xffffffffcbf577d6
  2a:*	8b 06                	mov    (%rsi),%eax		<-- trapping instruction
  2c:	a8 08                	test   $0x8,%al
  2e:	75 d4                	jne    0x4
  30:	8b 5d f4             	mov    -0xc(%rbp),%ebx
  33:	8b 75 f8             	mov    -0x8(%rbp),%esi
  36:	8b 7d fc             	mov    -0x4(%rbp),%edi
  39:	89 ec                	mov    %ebp,%esp
  3b:	5d                   	pop    %rbp
  3c:	c3                   	retq   
  3d:	8d                   	.byte 0x8d
  3e:	b4 26                	mov    $0x26,%ah

Code starting with the faulting instruction
===========================================
   0:	8b 06                	mov    (%rsi),%eax
   2:	a8 08                	test   $0x8,%al
   4:	75 d4                	jne    0xffffffffffffffda
   6:	8b 5d f4             	mov    -0xc(%rbp),%ebx
   9:	8b 75 f8             	mov    -0x8(%rbp),%esi
   c:	8b 7d fc             	mov    -0x4(%rbp),%edi
   f:	89 ec                	mov    %ebp,%esp
  11:	5d                   	pop    %rbp
  12:	c3                   	retq   
  13:	8d                   	.byte 0x8d
  14:	b4 26                	mov    $0x26,%ah
[  115.728529] EAX: d7ed9940 EBX: c107c280 ECX: 00000000 EDX: 00000002
[  115.728895] ESI: c107c280 EDI: cb40d425 EBP: c107ff20 ESP: c107ff14
[  115.729259] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000246
[  115.729653] ? preempt_schedule (kbuild/src/consumer/kernel/sched/core.c:5250) 
[  115.729891] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.730186] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.730479] ? preempt_schedule_common (kbuild/src/consumer/arch/x86/include/asm/bitops.h:207 kbuild/src/consumer/include/asm-generic/bitops/instrumented-non-atomic.h:135 kbuild/src/consumer/include/linux/thread_info.h:102 kbuild/src/consumer/include/linux/sched.h:1917 kbuild/src/consumer/kernel/sched/core.c:5232) 
[  115.730751] preempt_schedule (kbuild/src/consumer/kernel/sched/core.c:5250) 
[  115.730979] preempt_schedule_thunk (kbuild/src/consumer/arch/x86/entry/thunk_32.S:33) 
[  115.731227] ? _raw_spin_unlock (kbuild/src/consumer/kernel/locking/spinlock.c:184) 
[  115.731465] rcu_gp_kthread (kbuild/src/consumer/kernel/rcu/tree.c:1769 kbuild/src/consumer/kernel/rcu/tree.c:2103) 
[  115.731694] kthread (kbuild/src/consumer/kernel/kthread.c:292) 
[  115.731888] ? dump_blkd_tasks+0x40/0x40 
[  115.732180] ? kthread_create_worker_on_cpu (kbuild/src/consumer/kernel/kthread.c:245) 
[  115.732484] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_32.S:856) 
[  115.732723] BUG: spinlock recursion on CPU#1, udevd/275
[  115.733029] lock: rcu_state+0x0/0x2aa0, .magic: dead4ead, .owner: udevd/275, .owner_cpu: 1 
[  115.733514] CPU: 1 PID: 275 Comm: udevd Not tainted 5.11.0-rc2-00001-g35d0b389f3b2 #2
[  115.733968] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[  115.734449] Call Trace:
[  115.734596] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122) 
[  115.734796] spin_dump.cold (kbuild/src/consumer/kernel/locking/spinlock_debug.c:68 (discriminator 3)) 
[  115.735014] do_raw_spin_lock (kbuild/src/consumer/kernel/locking/spinlock_debug.c:75 kbuild/src/consumer/kernel/locking/spinlock_debug.c:84 kbuild/src/consumer/kernel/locking/spinlock_debug.c:112) 
[  115.735242] _raw_spin_lock_irqsave (kbuild/src/consumer/kernel/locking/spinlock.c:160) 
[  115.735498] ? rcu_force_quiescent_state (kbuild/src/consumer/kernel/rcu/tree.c:2664) 
[  115.735779] rcu_force_quiescent_state (kbuild/src/consumer/kernel/rcu/tree.c:2664) 
[  115.736049] rcu_sched_clock_irq.cold (kbuild/src/consumer/kernel/rcu/tree_stall.h:536 kbuild/src/consumer/kernel/rcu/tree_stall.h:652 kbuild/src/consumer/kernel/rcu/tree.c:3751 kbuild/src/consumer/kernel/rcu/tree.c:2580) 
[  115.736326] ? _raw_spin_unlock_irqrestore (kbuild/src/consumer/arch/x86/include/asm/preempt.h:102 kbuild/src/consumer/include/linux/spinlock_api_smp.h:161 kbuild/src/consumer/kernel/locking/spinlock.c:191) 
[  115.736617] ? hrtimer_run_queues (kbuild/src/consumer/kernel/time/hrtimer.c:1760) 
[  115.736864] update_process_times (kbuild/src/consumer/arch/x86/include/asm/preempt.h:26 kbuild/src/consumer/kernel/time/timer.c:1784) 
[  115.737109] tick_periodic (kbuild/src/consumer/kernel/time/tick-common.c:101) 
[  115.737329] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.737619] ? debug_smp_processor_id (kbuild/src/consumer/lib/smp_processor_id.c:65) 
[  115.737886] tick_handle_periodic (kbuild/src/consumer/kernel/time/tick-common.c:124) 
[  115.738136] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.738435] __sysvec_apic_timer_interrupt (kbuild/src/consumer/include/asm-generic/irq_regs.h:28 kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1105) 
[  115.738727] ? perf_iterate_sb (kbuild/src/consumer/include/linux/rcupdate.h:695 (discriminator 3) kbuild/src/consumer/kernel/events/core.c:7460 (discriminator 3)) 
[  115.738966] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.739247] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172) 
[  115.739484] EIP: _raw_spin_unlock_irq (kbuild/src/consumer/include/linux/spinlock_api_smp.h:169 kbuild/src/consumer/kernel/locking/spinlock.c:199) 
[ 115.739751] Code: 00 00 00 00 8d b4 26 00 00 00 00 90 55 89 e5 53 89 c3 8b 55 04 83 c0 10 e8 2f 53 69 ff 89 d8 e8 48 90 69 ff fb b8 01 00 00 00 <e8> 3d c7 66 ff 64 a1 ac 77 f5 cb 85 c0 74 03 5b 5d c3 e8 1a dc 5e
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	00 00                	add    %al,(%rax)
   4:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
   b:	90                   	nop
   c:	55                   	push   %rbp
   d:	89 e5                	mov    %esp,%ebp
   f:	53                   	push   %rbx
  10:	89 c3                	mov    %eax,%ebx
  12:	8b 55 04             	mov    0x4(%rbp),%edx
  15:	83 c0 10             	add    $0x10,%eax
  18:	e8 2f 53 69 ff       	callq  0xffffffffff69534c
  1d:	89 d8                	mov    %ebx,%eax
  1f:	e8 48 90 69 ff       	callq  0xffffffffff69906c
  24:	fb                   	sti    
  25:	b8 01 00 00 00       	mov    $0x1,%eax
  2a:*	e8 3d c7 66 ff       	callq  0xffffffffff66c76c		<-- trapping instruction
  2f:	64 a1 ac 77 f5 cb 85 	movabs %fs:0x374c085cbf577ac,%eax
  36:	c0 74 03 
  39:	5b                   	pop    %rbx
  3a:	5d                   	pop    %rbp
  3b:	c3                   	retq   
  3c:	e8                   	.byte 0xe8
  3d:	1a dc                	sbb    %ah,%bl
  3f:	5e                   	pop    %rsi

Code starting with the faulting instruction
===========================================
   0:	e8 3d c7 66 ff       	callq  0xffffffffff66c742
   5:	64 a1 ac 77 f5 cb 85 	movabs %fs:0x374c085cbf577ac,%eax
   c:	c0 74 03 
   f:	5b                   	pop    %rbx
  10:	5d                   	pop    %rbp
  11:	c3                   	retq   
  12:	e8                   	.byte 0xe8
  13:	1a dc                	sbb    %ah,%bl
  15:	5e                   	pop    %rsi
[  115.740821] EAX: 00000001 EBX: d7ef7940 ECX: 00000000 EDX: cb9e33fa
[  115.741184] ESI: dd75c240 EDI: 00000000 EBP: dce43d54 ESP: dce43d50
[  115.741547] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00000246
[  115.741940] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.742233] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.742524] ? _raw_spin_unlock_irq (kbuild/src/consumer/include/linux/spinlock_api_smp.h:169 kbuild/src/consumer/kernel/locking/spinlock.c:199) 
[  115.742788] finish_task_switch (kbuild/src/consumer/arch/x86/include/asm/current.h:15 kbuild/src/consumer/kernel/sched/core.c:4091 kbuild/src/consumer/kernel/sched/core.c:4192) 
[  115.743032] ? finish_task_switch (kbuild/src/consumer/kernel/sched/core.c:3965 kbuild/src/consumer/kernel/sched/core.c:4181) 
[  115.743285] __schedule (kbuild/src/consumer/kernel/sched/core.c:5072) 
[  115.743492] ? __queue_work (kbuild/src/consumer/kernel/workqueue.c:1470) 
[  115.743719] ? sysvec_call_function_single (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.744011] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:49 (discriminator 1) kbuild/src/consumer/arch/x86/include/asm/irqflags.h:89 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5327 (discriminator 1)) 
[  115.744258] irqentry_exit_cond_resched (??:?) 
[  115.744569] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:435) 
[  115.744782] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1096) 
[  115.745064] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1172) 
[  115.745302] EIP: preempt_count_sub (kbuild/src/consumer/kernel/sched/core.c:4758) 
[ 115.745555] Code: 15 10 22 15 cc 85 d2 75 18 64 8b 15 ac 77 f5 cb 81 e2 ff ff ff 7f 39 c2 7c 17 3d fe 00 00 00 76 40 f7 d8 64 01 05 ac 77 f5 cb <c9> c3 8d 74 26 00 90 e8 23 a0 98 00 85 c0 90 74 ef 8b 0d 0c d5 ea
All code
========
   0:	15 10 22 15 cc       	adc    $0xcc152210,%eax
   5:	85 d2                	test   %edx,%edx
   7:	75 18                	jne    0x21
   9:	64 8b 15 ac 77 f5 cb 	mov    %fs:-0x340a8854(%rip),%edx        # 0xffffffffcbf577bc
  10:	81 e2 ff ff ff 7f    	and    $0x7fffffff,%edx
  16:	39 c2                	cmp    %eax,%edx
  18:	7c 17                	jl     0x31
  1a:	3d fe 00 00 00       	cmp    $0xfe,%eax
  1f:	76 40                	jbe    0x61
  21:	f7 d8                	neg    %eax
  23:	64 01 05 ac 77 f5 cb 	add    %eax,%fs:-0x340a8854(%rip)        # 0xffffffffcbf577d6
  2a:*	c9                   	leaveq 		<-- trapping instruction
  2b:	c3                   	retq   
  2c:	8d 74 26 00          	lea    0x0(%rsi,%riz,1),%esi
  30:	90                   	nop
  31:	e8 23 a0 98 00       	callq  0x98a059
  36:	85 c0                	test   %eax,%eax
  38:	90                   	nop
  39:	74 ef                	je     0x2a
  3b:	8b                   	.byte 0x8b
  3c:	0d                   	.byte 0xd
  3d:	0c d5                	or     $0xd5,%al
  3f:	ea                   	(bad)  

Code starting with the faulting instruction
===========================================
   0:	c9                   	leaveq 
   1:	c3                   	retq   
   2:	8d 74 26 00          	lea    0x0(%rsi,%riz,1),%esi
   6:	90                   	nop
   7:	e8 23 a0 98 00       	callq  0x98a02f
   c:	85 c0                	test   %eax,%eax
   e:	90                   	nop
   f:	74 ef                	je     0x0
  11:	8b                   	.byte 0x8b
  12:	0d                   	.byte 0xd
  13:	0c d5                	or     $0xd5,%al
  15:	ea                   	(bad)  


To reproduce:

        # build kernel
	cd linux
	cp config-5.11.0-rc2-00001-g35d0b389f3b2 .config
	make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



Thanks,
Oliver Sang


View attachment "config-5.11.0-rc2-00001-g35d0b389f3b2" of type "text/plain" (131513 bytes)

View attachment "job-script" of type "text/plain" (4229 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (16732 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ