[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210407072934.GD22260@xsang-OptiPlex-9020>
Date: Wed, 7 Apr 2021 15:29:34 +0800
From: kernel test robot <oliver.sang@...el.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
lkp@...ts.01.org, lkp@...el.com
Subject: [rcu] 7308e02404: INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: 7308e0240410d3644c9d7cc6263079a58e3effeb ("rcu: Make rcu_read_unlock_special() expedite strict grace periods")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
in testcase: trinity
version: trinity-static-i386-x86_64-f93256fb_2019-08-28
with following parameters:
group: group-04
test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/
on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+--------------------------------------------------------------+------------+------------+
| | 5e59fba573 | 7308e02404 |
+--------------------------------------------------------------+------------+------------+
| boot_successes | 134 | 51 |
| boot_failures | 0 | 13 |
| INFO:rcu_preempt_self-detected_stall_on_CPU | 0 | 3 |
| EIP:lock_is_held_type | 0 | 1 |
| EIP:_raw_write_unlock_irq | 0 | 1 |
| EIP:_raw_spin_unlock_irq | 0 | 6 |
| EIP:perf_lock_task_context | 0 | 2 |
| BUG:kernel_hang_in_test_stage | 0 | 6 |
| WARNING:at_kernel/rcu/tree_stall.h:#rcu_check_gp_start_stall | 0 | 2 |
| EIP:rcu_check_gp_start_stall | 0 | 2 |
| INFO:rcu_preempt_detected_stalls_on_CPUs/tasks | 0 | 4 |
| EIP:queued_spin_lock_slowpath | 0 | 3 |
| EIP:_raw_spin_unlock_irqrestore | 0 | 3 |
| WARNING:inconsistent_lock_state | 0 | 4 |
| inconsistent{IN-HARDIRQ-W}->{HARDIRQ-ON-W}usage | 0 | 4 |
| EIP:preempt_schedule_common | 0 | 2 |
| BUG:scheduling_while_atomic | 0 | 4 |
| EIP:schedule | 0 | 1 |
| EIP:hlock_class | 0 | 1 |
| EIP:__xapic_wait_icr_idle | 0 | 1 |
| EIP:rcu_preempt_deferred_qs_irqrestore | 0 | 1 |
+--------------------------------------------------------------+------------+------------+
If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>
[ 380.783397] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 380.783736] (detected by 1, t=10502 jiffies, g=185097, q=3169)
[ 380.784177] rcu: All QSes seen, last rcu_preempt kthread activity 4119 (8021-3902), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 380.784706] rcu: rcu_preempt kthread starved for 4119 jiffies! g185097 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 380.785185] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 380.785607] rcu: RCU grace-period kthread stack dump:
[ 380.785845] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00004008
[ 380.786307] Call Trace:
[ 380.786428] __schedule (kbuild/src/consumer/kernel/sched/core.c:4327 kbuild/src/consumer/kernel/sched/core.c:5075)
[ 380.786598] schedule (kbuild/src/consumer/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5155 (discriminator 1))
[ 380.786751] schedule_timeout (kbuild/src/consumer/kernel/time/timer.c:1893)
[ 380.786948] ? __next_timer_interrupt (kbuild/src/consumer/kernel/time/timer.c:1816)
[ 380.787156] rcu_gp_kthread (kbuild/src/consumer/kernel/rcu/tree.c:2005 kbuild/src/consumer/kernel/rcu/tree.c:2178)
[ 380.787333] ? strict_work_handler (kbuild/src/consumer/kernel/rcu/tree.c:2153)
[ 380.787532] kthread (kbuild/src/consumer/kernel/kthread.c:294)
[ 380.787681] ? strict_work_handler (kbuild/src/consumer/kernel/rcu/tree.c:2153)
[ 380.787879] ? test_bit (kbuild/src/consumer/arch/x86/include/asm/msr.h:124)
[ 380.788074] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_32.S:856)
[ 380.788241] rcu: Stack dump where RCU GP kthread last ran:
[ 380.788492] Sending NMI from CPU 1 to CPUs 0:
[ 380.789729] NMI backtrace for cpu 0
[ 380.789730] CPU: 0 PID: 2497 Comm: getty Not tainted 5.12.0-rc2-00004-g7308e0240410 #1
[ 380.789731] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 380.789731] EIP: queued_spin_lock_slowpath (kbuild/src/consumer/arch/x86/include/asm/vdso/processor.h:19 kbuild/src/consumer/arch/x86/include/asm/qspinlock.h:100 kbuild/src/consumer/kernel/locking/qspinlock.c:326)
[ 380.789732] Code: bb 01 00 00 00 7f 1b 81 fa 00 01 00 00 75 2c b8 01 02 00 00 eb 1b f0 0f b1 19 85 c0 0f 84 6c 01 00 00 8b 01 85 c0 74 ee f3 90 <eb> f6 48 74 0c f3 90 8b 11 81 fa 00 01 00 00 74 f1 81 e2 00 ff ff
All code
========
0: bb 01 00 00 00 mov $0x1,%ebx
5: 7f 1b jg 0x22
7: 81 fa 00 01 00 00 cmp $0x100,%edx
d: 75 2c jne 0x3b
f: b8 01 02 00 00 mov $0x201,%eax
14: eb 1b jmp 0x31
16: f0 0f b1 19 lock cmpxchg %ebx,(%rcx)
1a: 85 c0 test %eax,%eax
1c: 0f 84 6c 01 00 00 je 0x18e
22: 8b 01 mov (%rcx),%eax
24: 85 c0 test %eax,%eax
26: 74 ee je 0x16
28: f3 90 pause
2a:* eb f6 jmp 0x22 <-- trapping instruction
2c: 48 74 0c rex.W je 0x3b
2f: f3 90 pause
31: 8b 11 mov (%rcx),%edx
33: 81 fa 00 01 00 00 cmp $0x100,%edx
39: 74 f1 je 0x2c
3b: 81 .byte 0x81
3c: e2 00 loop 0x3e
3e: ff (bad)
3f: ff .byte 0xff
Code starting with the faulting instruction
===========================================
0: eb f6 jmp 0xfffffffffffffff8
2: 48 74 0c rex.W je 0x11
5: f3 90 pause
7: 8b 11 mov (%rcx),%edx
9: 81 fa 00 01 00 00 cmp $0x100,%edx
f: 74 f1 je 0x2
11: 81 .byte 0x81
12: e2 00 loop 0x14
14: ff (bad)
15: ff .byte 0xff
[ 380.789732] EAX: 00000001 EBX: 00000001 ECX: b7a5a300 EDX: 00000001
[ 380.789733] ESI: 00000046 EDI: dae55300 EBP: dddafda0 ESP: dddafd94
[ 380.789733] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000002
[ 380.789734] CR0: 80050033 CR2: 0177a77c CR3: 2de0e000 CR4: 00040690
[ 380.789734] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 380.789735] DR6: fffe0ff0 DR7: 00000400
[ 380.789735] Call Trace:
[ 380.789735] do_raw_spin_lock (kbuild/src/consumer/include/asm-generic/qspinlock.h:85 kbuild/src/consumer/kernel/locking/spinlock_debug.c:113)
[ 380.789735] _raw_spin_lock_irqsave (kbuild/src/consumer/kernel/locking/spinlock.c:160)
[ 380.789736] ? rcu_report_qs_rdp (kbuild/src/consumer/kernel/rcu/tree.c:2336 (discriminator 13))
[ 380.789736] rcu_report_qs_rdp (kbuild/src/consumer/kernel/rcu/tree.c:2336 (discriminator 13))
[ 380.789736] rcu_preempt_deferred_qs_irqrestore (kbuild/src/consumer/kernel/rcu/tree_plugin.h:465)
[ 380.789737] rcu_preempt_deferred_qs (kbuild/src/consumer/kernel/rcu/tree_plugin.h:580)
[ 380.789737] rcu_note_context_switch (kbuild/src/consumer/kernel/rcu/tree_plugin.h:333)
[ 380.789738] __schedule (kbuild/src/consumer/kernel/sched/core.c:4992 (discriminator 3))
[ 380.789738] ? exc_nmi (kbuild/src/consumer/arch/x86/kernel/irq_work.c:17)
[ 380.789738] preempt_schedule_irq (kbuild/src/consumer/arch/x86/include/asm/irqflags.h:29 kbuild/src/consumer/arch/x86/include/asm/irqflags.h:70 kbuild/src/consumer/arch/x86/include/asm/irqflags.h:137 kbuild/src/consumer/kernel/sched/core.c:5533)
[ 380.789739] irqentry_exit_cond_resched (kbuild/src/consumer/kernel/entry/common.c:394)
[ 380.789739] irqentry_exit (kbuild/src/consumer/kernel/entry/common.c:432)
[ 380.789739] sysvec_irq_work (kbuild/src/consumer/arch/x86/kernel/irq_work.c:17)
[ 380.789740] handle_exception (kbuild/src/consumer/arch/x86/entry/entry_32.S:1179)
[ 380.789740] EIP: _raw_spin_unlock_irqrestore (kbuild/src/consumer/include/linux/spinlock_api_smp.h:161 kbuild/src/consumer/kernel/locking/spinlock.c:191)
[ 380.789741] Code: 8f 68 6c ff 81 e3 00 02 00 00 74 05 e8 e4 99 71 ff 9c 58 0f ba e0 09 73 05 e8 4a 9d ff ff 85 db 74 01 fb 64 ff 0d 0c 08 ca b7 <75> 05 e8 df 1c 00 00 5b 5e 5d c3 55 89 e5 53 64 ff 05 0c 08 ca b7
All code
========
0: 8f 68 6c ff (bad)
4: 81 e3 00 02 00 00 and $0x200,%ebx
a: 74 05 je 0x11
c: e8 e4 99 71 ff callq 0xffffffffff7199f5
11: 9c pushfq
12: 58 pop %rax
13: 0f ba e0 09 bt $0x9,%eax
17: 73 05 jae 0x1e
19: e8 4a 9d ff ff callq 0xffffffffffff9d68
1e: 85 db test %ebx,%ebx
20: 74 01 je 0x23
22: fb sti
23: 64 ff 0d 0c 08 ca b7 decl %fs:-0x4835f7f4(%rip) # 0xffffffffb7ca0836
2a:* 75 05 jne 0x31 <-- trapping instruction
2c: e8 df 1c 00 00 callq 0x1d10
31: 5b pop %rbx
32: 5e pop %rsi
33: 5d pop %rbp
34: c3 retq
35: 55 push %rbp
36: 89 e5 mov %esp,%ebp
38: 53 push %rbx
39: 64 ff 05 0c 08 ca b7 incl %fs:-0x4835f7f4(%rip) # 0xffffffffb7ca084c
Code starting with the faulting instruction
===========================================
0: 75 05 jne 0x7
2: e8 df 1c 00 00 callq 0x1ce6
7: 5b pop %rbx
8: 5e pop %rsi
9: 5d pop %rbp
a: c3 retq
b: 55 push %rbp
c: 89 e5 mov %esp,%ebp
e: 53 push %rbx
f: 64 ff 05 0c 08 ca b7 incl %fs:-0x4835f7f4(%rip) # 0xffffffffb7ca0822
To reproduce:
# build kernel
cd linux
cp config-5.12.0-rc2-00004-g7308e0240410 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=i386 olddefconfig prepare modules_prepare bzImage
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc2-00004-g7308e0240410" of type "text/plain" (137169 bytes)
View attachment "job-script" of type "text/plain" (4184 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (20164 bytes)
Powered by blists - more mailing lists