lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210407072934.GD22260@xsang-OptiPlex-9020>
Date:   Wed, 7 Apr 2021 15:29:34 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        lkp@...ts.01.org, lkp@...el.com
Subject: [rcu]  7308e02404: INFO:rcu_preempt_detected_stalls_on_CPUs/tasks



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 7308e0240410d3644c9d7cc6263079a58e3effeb ("rcu: Make rcu_read_unlock_special() expedite strict grace periods")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master


in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with following parameters:

	group: group-04

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+--------------------------------------------------------------+------------+------------+
|                                                              | 5e59fba573 | 7308e02404 |
+--------------------------------------------------------------+------------+------------+
| boot_successes                                               | 134        | 51         |
| boot_failures                                                | 0          | 13         |
| INFO:rcu_preempt_self-detected_stall_on_CPU                  | 0          | 3          |
| EIP:lock_is_held_type                                        | 0          | 1          |
| EIP:_raw_write_unlock_irq                                    | 0          | 1          |
| EIP:_raw_spin_unlock_irq                                     | 0          | 6          |
| EIP:perf_lock_task_context                                   | 0          | 2          |
| BUG:kernel_hang_in_test_stage                                | 0          | 6          |
| WARNING:at_kernel/rcu/tree_stall.h:#rcu_check_gp_start_stall | 0          | 2          |
| EIP:rcu_check_gp_start_stall                                 | 0          | 2          |
| INFO:rcu_preempt_detected_stalls_on_CPUs/tasks               | 0          | 4          |
| EIP:queued_spin_lock_slowpath                                | 0          | 3          |
| EIP:_raw_spin_unlock_irqrestore                              | 0          | 3          |
| WARNING:inconsistent_lock_state                              | 0          | 4          |
| inconsistent{IN-HARDIRQ-W}->{HARDIRQ-ON-W}usage              | 0          | 4          |
| EIP:preempt_schedule_common                                  | 0          | 2          |
| BUG:scheduling_while_atomic                                  | 0          | 4          |
| EIP:schedule                                                 | 0          | 1          |
| EIP:hlock_class                                              | 0          | 1          |
| EIP:__xapic_wait_icr_idle                                    | 0          | 1          |
| EIP:rcu_preempt_deferred_qs_irqrestore                       | 0          | 1          |
+--------------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


[  380.783397] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  380.783736] 	(detected by 1, t=10502 jiffies, g=185097, q=3169)
[  380.784177] rcu: All QSes seen, last rcu_preempt kthread activity 4119 (8021-3902), jiffies_till_next_fqs=1, root ->qsmask 0x0
[  380.784706] rcu: rcu_preempt kthread starved for 4119 jiffies! g185097 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[  380.785185] rcu: 	Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  380.785607] rcu: RCU grace-period kthread stack dump:
[  380.785845] task:rcu_preempt     state:R  running task     stack:    0 pid:   14 ppid:     2 flags:0x00004008
[  380.786307] Call Trace:
[  380.786428] __schedule (kbuild/src/consumer/kernel/sched/core.c:4327 kbuild/src/consumer/kernel/sched/core.c:5075) 
[  380.786598] schedule (kbuild/src/consumer/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5155 (discriminator 1)) 
[  380.786751] schedule_timeout (kbuild/src/consumer/kernel/time/timer.c:1893) 
[  380.786948] ? __next_timer_interrupt (kbuild/src/consumer/kernel/time/timer.c:1816) 
[  380.787156] rcu_gp_kthread (kbuild/src/consumer/kernel/rcu/tree.c:2005 kbuild/src/consumer/kernel/rcu/tree.c:2178) 
[  380.787333] ? strict_work_handler (kbuild/src/consumer/kernel/rcu/tree.c:2153) 
[  380.787532] kthread (kbuild/src/consumer/kernel/kthread.c:294) 
[  380.787681] ? strict_work_handler (kbuild/src/consumer/kernel/rcu/tree.c:2153) 
[  380.787879] ? test_bit (kbuild/src/consumer/arch/x86/include/asm/msr.h:124) 
[  380.788074] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_32.S:856) 
[  380.788241] rcu: Stack dump where RCU GP kthread last ran:
[  380.788492] Sending NMI from CPU 1 to CPUs 0:
[  380.789729] NMI backtrace for cpu 0
[  380.789730] CPU: 0 PID: 2497 Comm: getty Not tainted 5.12.0-rc2-00004-g7308e0240410 #1
[  380.789731] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[  380.789731] EIP: queued_spin_lock_slowpath (kbuild/src/consumer/arch/x86/include/asm/vdso/processor.h:19 kbuild/src/consumer/arch/x86/include/asm/qspinlock.h:100 kbuild/src/consumer/kernel/locking/qspinlock.c:326) 
[ 380.789732] Code: bb 01 00 00 00 7f 1b 81 fa 00 01 00 00 75 2c b8 01 02 00 00 eb 1b f0 0f b1 19 85 c0 0f 84 6c 01 00 00 8b 01 85 c0 74 ee f3 90 <eb> f6 48 74 0c f3 90 8b 11 81 fa 00 01 00 00 74 f1 81 e2 00 ff ff
All code
========
   0:	bb 01 00 00 00       	mov    $0x1,%ebx
   5:	7f 1b                	jg     0x22
   7:	81 fa 00 01 00 00    	cmp    $0x100,%edx
   d:	75 2c                	jne    0x3b
   f:	b8 01 02 00 00       	mov    $0x201,%eax
  14:	eb 1b                	jmp    0x31
  16:	f0 0f b1 19          	lock cmpxchg %ebx,(%rcx)
  1a:	85 c0                	test   %eax,%eax
  1c:	0f 84 6c 01 00 00    	je     0x18e
  22:	8b 01                	mov    (%rcx),%eax
  24:	85 c0                	test   %eax,%eax
  26:	74 ee                	je     0x16
  28:	f3 90                	pause  
  2a:*	eb f6                	jmp    0x22		<-- trapping instruction
  2c:	48 74 0c             	rex.W je 0x3b
  2f:	f3 90                	pause  
  31:	8b 11                	mov    (%rcx),%edx
  33:	81 fa 00 01 00 00    	cmp    $0x100,%edx
  39:	74 f1                	je     0x2c
  3b:	81                   	.byte 0x81
  3c:	e2 00                	loop   0x3e
  3e:	ff                   	(bad)  
  3f:	ff                   	.byte 0xff

Code starting with the faulting instruction
===========================================
   0:	eb f6                	jmp    0xfffffffffffffff8
   2:	48 74 0c             	rex.W je 0x11
   5:	f3 90                	pause  
   7:	8b 11                	mov    (%rcx),%edx
   9:	81 fa 00 01 00 00    	cmp    $0x100,%edx
   f:	74 f1                	je     0x2
  11:	81                   	.byte 0x81
  12:	e2 00                	loop   0x14
  14:	ff                   	(bad)  
  15:	ff                   	.byte 0xff
[  380.789732] EAX: 00000001 EBX: 00000001 ECX: b7a5a300 EDX: 00000001
[  380.789733] ESI: 00000046 EDI: dae55300 EBP: dddafda0 ESP: dddafd94
[  380.789733] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
[  380.789734] CR0: 80050033 CR2: 0177a77c CR3: 2de0e000 CR4: 00040690
[  380.789734] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  380.789735] DR6: fffe0ff0 DR7: 00000400
[  380.789735] Call Trace:
[  380.789735] do_raw_spin_lock (kbuild/src/consumer/include/asm-generic/qspinlock.h:85 kbuild/src/consumer/kernel/locking/spinlock_debug.c:113) 
[  380.789735] _raw_spin_lock_irqsave (kbuild/src/consumer/kernel/locking/spinlock.c:160) 
[  380.789736] ? rcu_report_qs_rdp (kbuild/src/consumer/kernel/rcu/tree.c:2336 (discriminator 13)) 
[  380.789736] rcu_report_qs_rdp (kbuild/src/consumer/kernel/rcu/tree.c:2336 (discriminator 13)) 
[  380.789736] rcu_preempt_deferred_qs_irqrestore (kbuild/src/consumer/kernel/rcu/tree_plugin.h:465) 
[  380.789737] rcu_preempt_deferred_qs (kbuild/src/consumer/kernel/rcu/tree_plugin.h:580) 
[  380.789737] rcu_note_context_switch (kbuild/src/consumer/kernel/rcu/tree_plugin.h:333) 
[  380.789738] __schedule (kbuild/src/consumer/kernel/sched/core.c:4992 (discriminator 3)) 
[  380.789738] ? exc_nmi (kbuild/src/consumer/arch/x86/kernel/irq_work.c:17) 
[  380.789738] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:29 kbuild/src/consumer/arch/x86/include/asm/irqflags.h:70 kbuild/src/consumer/arch/x86/include/asm/irqflags.h:137 kbuild/src/consumer/kernel/sched/core.c:5533) 
[  380.789739] irqentry_exit_cond_resched (kbuild/src/consumer/kernel/entry/common.c:394) 
[  380.789739] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:432) 
[  380.789739] sysvec_irq_work (kbuild/src/consumer/arch/x86/kernel/irq_work.c:17) 
[  380.789740] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1179) 
[  380.789740] EIP: _raw_spin_unlock_irqrestore (kbuild/src/consumer/include/linux/spinlock_api_smp.h:161 kbuild/src/consumer/kernel/locking/spinlock.c:191) 
[ 380.789741] Code: 8f 68 6c ff 81 e3 00 02 00 00 74 05 e8 e4 99 71 ff 9c 58 0f ba e0 09 73 05 e8 4a 9d ff ff 85 db 74 01 fb 64 ff 0d 0c 08 ca b7 <75> 05 e8 df 1c 00 00 5b 5e 5d c3 55 89 e5 53 64 ff 05 0c 08 ca b7
All code
========
   0:	8f 68 6c ff          	(bad)  
   4:	81 e3 00 02 00 00    	and    $0x200,%ebx
   a:	74 05                	je     0x11
   c:	e8 e4 99 71 ff       	callq  0xffffffffff7199f5
  11:	9c                   	pushfq 
  12:	58                   	pop    %rax
  13:	0f ba e0 09          	bt     $0x9,%eax
  17:	73 05                	jae    0x1e
  19:	e8 4a 9d ff ff       	callq  0xffffffffffff9d68
  1e:	85 db                	test   %ebx,%ebx
  20:	74 01                	je     0x23
  22:	fb                   	sti    
  23:	64 ff 0d 0c 08 ca b7 	decl   %fs:-0x4835f7f4(%rip)        # 0xffffffffb7ca0836
  2a:*	75 05                	jne    0x31		<-- trapping instruction
  2c:	e8 df 1c 00 00       	callq  0x1d10
  31:	5b                   	pop    %rbx
  32:	5e                   	pop    %rsi
  33:	5d                   	pop    %rbp
  34:	c3                   	retq   
  35:	55                   	push   %rbp
  36:	89 e5                	mov    %esp,%ebp
  38:	53                   	push   %rbx
  39:	64 ff 05 0c 08 ca b7 	incl   %fs:-0x4835f7f4(%rip)        # 0xffffffffb7ca084c

Code starting with the faulting instruction
===========================================
   0:	75 05                	jne    0x7
   2:	e8 df 1c 00 00       	callq  0x1ce6
   7:	5b                   	pop    %rbx
   8:	5e                   	pop    %rsi
   9:	5d                   	pop    %rbp
   a:	c3                   	retq   
   b:	55                   	push   %rbp
   c:	89 e5                	mov    %esp,%ebp
   e:	53                   	push   %rbx
   f:	64 ff 05 0c 08 ca b7 	incl   %fs:-0x4835f7f4(%rip)        # 0xffffffffb7ca0822


To reproduce:

        # build kernel
	cd linux
	cp config-5.12.0-rc2-00004-g7308e0240410 .config
	make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email



---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00004-g7308e0240410" of type "text/plain" (137169 bytes)

View attachment "job-script" of type "text/plain" (4184 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (20164 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ