Message-ID: <20210315140441.GA4401@xsang-OptiPlex-9020>
Date: Mon, 15 Mar 2021 22:04:41 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Petr Mladek <pmladek@...e.com>
Cc: 0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
lkp@...ts.01.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Laurence Oberman <loberman@...hat.com>,
Vincent Whitchurch <vincent.whitchurch@...s.com>,
Michal Hocko <mhocko@...e.com>, Petr Mladek <pmladek@...e.com>
Subject: c928e9b143: BUG:soft_lockup-CPU##stuck_for#s![perf:#]
Greetings,
FYI, we noticed the following commit (built with gcc-9):
commit: c928e9b1439de4d74b942abd30d5c838a40af777 ("[PATCH v2 7/7] Test softlockup")
url: https://github.com/0day-ci/linux/commits/Petr-Mladek/watchdog-softlockup-Report-overall-time-and-some-cleanup/20210311-205501
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git a74e6a014c9d4d4161061f770c9b4f98372ac778
in testcase: stress-ng
version: stress-ng-x86_64-0.11-06_20210314
with the following parameters:
nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: filesystem
test: binderfs
cpufreq_governor: performance
ucode: 0x42e
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <oliver.sang@...el.com>
[ 70.666742] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [perf:1794]
[ 70.675062] Modules linked in: dm_mod xfs libcrc32c sd_mod t10_pi sg intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass mgag200 crct10dif_pclmul crc32_pclmul crc32c_intel drm_kms_helper ghash_clmulni_intel isci syscopyarea sysfillrect rapl sysimgblt libsas fb_sys_fops ahci intel_cstate ipmi_si scsi_transport_sas libahci mei_me ipmi_devintf ioatdma drm intel_uncore ipmi_msghandler libata mei joydev dca wmi ip_tables
[ 70.725024] CPU: 8 PID: 1794 Comm: perf Not tainted 5.12.0-rc2-00303-gc928e9b1439d #1
[ 70.734501] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 70.746688] RIP: 0010:version_proc_show (kbuild/src/consumer/fs/proc/version.c:15)
[ 70.752690] Code: c3 0f 1f 44 00 00 55 48 c7 c6 00 dc 24 82 48 89 fd 48 c7 c7 a8 ed 57 82 e8 af 5d ff ff c6 05 90 60 ba 01 01 8a 05 8a 60 ba 01 <84> c0 74 04 f3 90 eb f2 65 48 8b 04 25 00 6f 01 00 48 8b 80 98 0b
All code
========
0: c3 retq
1: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
6: 55 push %rbp
7: 48 c7 c6 00 dc 24 82 mov $0xffffffff8224dc00,%rsi
e: 48 89 fd mov %rdi,%rbp
11: 48 c7 c7 a8 ed 57 82 mov $0xffffffff8257eda8,%rdi
18: e8 af 5d ff ff callq 0xffffffffffff5dcc
1d: c6 05 90 60 ba 01 01 movb $0x1,0x1ba6090(%rip) # 0x1ba60b4
24: 8a 05 8a 60 ba 01 mov 0x1ba608a(%rip),%al # 0x1ba60b4
2a:* 84 c0 test %al,%al <-- trapping instruction
2c: 74 04 je 0x32
2e: f3 90 pause
30: eb f2 jmp 0x24
32: 65 48 8b 04 25 00 6f mov %gs:0x16f00,%rax
39: 01 00
3b: 48 rex.W
3c: 8b .byte 0x8b
3d: 80 .byte 0x80
3e: 98 cwtl
3f: 0b .byte 0xb
Code starting with the faulting instruction
===========================================
0: 84 c0 test %al,%al
2: 74 04 je 0x8
4: f3 90 pause
6: eb f2 jmp 0xfffffffffffffffa
8: 65 48 8b 04 25 00 6f mov %gs:0x16f00,%rax
f: 01 00
11: 48 rex.W
12: 8b .byte 0x8b
13: 80 .byte 0x80
14: 98 cwtl
15: 0b .byte 0xb
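The same `eb f2` bytes decode as `jmp 0x24` in the full listing and as `jmp 0xfffffffffffffffa` in the listing that starts at the faulting instruction. A short jump stores a signed 8-bit displacement relative to the end of the jump instruction, so the printed target depends on where the listing starts. The following is a minimal sketch of that arithmetic; the helper name `short_jump_target` is made up for illustration and is not part of any decoding tool.

```python
def short_jump_target(insn_offset: int, rel8: int, insn_len: int = 2) -> int:
    """Target of an x86 short jump (opcode + rel8), shown as an unsigned 64-bit offset."""
    # Interpret the displacement byte as a signed 8-bit value.
    disp = rel8 - 0x100 if rel8 >= 0x80 else rel8
    # The displacement is applied to the address of the *next* instruction.
    return (insn_offset + insn_len + disp) & 0xFFFFFFFFFFFFFFFF


print(hex(short_jump_target(0x30, 0xF2)))  # jmp in the full listing
print(hex(short_jump_target(0x06, 0xF2)))  # same jmp, listing rebased at the faulting instruction
print(hex(short_jump_target(0x2C, 0x04)))  # je that would leave the loop
```

Both forms describe the same loop: the byte written by the `movb $0x1` at offset 0x1d is reloaded at offset 0x24, tested, and the `jmp` returns to the load for as long as the byte is non-zero.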
[ 70.775248] RSP: 0018:ffffc9000b84bdd0 EFLAGS: 00000202
[ 70.781821] RAX: 0000000000000001 RBX: ffff888111af7ca8 RCX: 0000000000000000
[ 70.790549] RDX: 0000000000000000 RSI: ffff888f02a177f0 RDI: ffff888f02a177f0
[ 70.799294] RBP: ffff888111af7ca8 R08: ffff888f02a177f0 R09: ffffc9000b84bbf0
[ 70.808019] R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000b84be88
[ 70.816751] R13: ffffc9000b84be60 R14: ffff888111af7cd0 R15: 0000000000000001
[ 70.825493] FS: 00007f3b1f0397c0(0000) GS:ffff888f02a00000(0000) knlGS:0000000000000000
[ 70.835310] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 70.842505] CR2: 00005650280be178 CR3: 0000000194ba2004 CR4: 00000000001706e0
[ 70.851261] Call Trace:
[ 70.854739] seq_read_iter (kbuild/src/consumer/fs/seq_file.c:227)
[ 70.859674] proc_reg_read_iter (kbuild/src/consumer/fs/proc/inode.c:311)
[ 70.864901] new_sync_read (kbuild/src/consumer/fs/read_write.c:416 (discriminator 1))
[ 70.869838] vfs_read (kbuild/src/consumer/fs/read_write.c:496)
[ 70.874326] ksys_read (kbuild/src/consumer/fs/read_write.c:634)
[ 70.878653] do_syscall_64 (kbuild/src/consumer/arch/x86/entry/common.c:46)
[ 70.883389] entry_SYSCALL_64_after_hwframe (kbuild/src/consumer/arch/x86/entry/entry_64.S:112)
[ 70.889715] RIP: 0033:0x7f3b1fcd5461
[ 70.894432] Code: fe ff ff 50 48 8d 3d fe d0 09 00 e8 e9 03 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 99 62 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
All code
========
0: fe (bad)
1: ff (bad)
2: ff 50 48 callq *0x48(%rax)
5: 8d 3d fe d0 09 00 lea 0x9d0fe(%rip),%edi # 0x9d109
b: e8 e9 03 02 00 callq 0x203f9
10: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
17: 00 00
19: 48 8d 05 99 62 0d 00 lea 0xd6299(%rip),%rax # 0xd62b9
20: 8b 00 mov (%rax),%eax
22: 85 c0 test %eax,%eax
24: 75 13 jne 0x39
26: 31 c0 xor %eax,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 57 ja 0x89
32: c3 retq
33: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
39: 41 54 push %r12
3b: 49 89 d4 mov %rdx,%r12
3e: 55 push %rbp
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 57 ja 0x5f
8: c3 retq
9: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
f: 41 54 push %r12
11: 49 89 d4 mov %rdx,%r12
14: 55 push %rbp
15: 48 rex.W
[ 70.916824] RSP: 002b:00007fffa9898db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 70.922741] watchdog: BUG: soft lockup - CPU#27 stuck for 27s! [perf:1882]
[ 70.925999] RAX: ffffffffffffffda RBX: 0000565027ffc970 RCX: 00007f3b1fcd5461
[ 70.934426] Modules linked in: dm_mod
[ 70.943145] RDX: 0000000000000400 RSI: 0000565027ffcc20 RDI: 0000000000000004
[ 70.943147] RBP: 0000000000000d68 R08: 0000000000000001 R09: 0000000000000000
[ 70.947963] xfs
[ 70.956663] R10: 00007f3b1f0397c0 R11: 0000000000000246 R12: 00007f3b1fda2760
[ 70.965389] libcrc32c
[ 70.968210] R13: 00007f3b1fda32a0 R14: 0000000000000fff R15: 0000565027ffc970
[ 70.976908] sd_mod
[ 70.980313] Kernel panic - not syncing: softlockup: hung tasks
[ 70.989033] t10_pi
[ 70.992181] CPU: 8 PID: 1794 Comm: perf Tainted: G L 5.12.0-rc2-00303-gc928e9b1439d #1
[ 70.999457] sg
[ 71.002521] Hardware name: Intel Corporation S2600WP/S2600WP, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[ 71.013538] intel_rapl_msr
[ 71.016248] Call Trace:
[ 71.028495] intel_rapl_common
[ 71.032402] <IRQ>
[ 71.035857] sb_edac
[ 71.039990] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122)
[ 71.042930] x86_pkg_temp_thermal
[ 71.046073] panic (kbuild/src/consumer/kernel/panic.c:249)
[ 71.050431] intel_powerclamp
[ 71.054752] watchdog_timer_fn.cold (kbuild/src/consumer/kernel/watchdog.c:433)
[ 71.058764] coretemp
[ 71.062645] ? softlockup_fn (kbuild/src/consumer/kernel/watchdog.c:354)
[ 71.068018] kvm_intel
[ 71.071163] __hrtimer_run_queues (kbuild/src/consumer/kernel/time/hrtimer.c:1519 kbuild/src/consumer/kernel/time/hrtimer.c:1583)
[ 71.075919] kvm
[ 71.079161] hrtimer_interrupt (kbuild/src/consumer/kernel/time/hrtimer.c:1648)
[ 71.084497] irqbypass
[ 71.087153] __sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/include/asm/jump_label.h:25 kbuild/src/consumer/include/linux/jump_label.h:200 kbuild/src/consumer/arch/x86/include/asm/trace/irq_vectors.h:41 kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1107)
[ 71.092326] mgag200
[ 71.095499] sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/kernel/apic/apic.c:1100 (discriminator 14))
[ 71.101652] crct10dif_pclmul
[ 71.104609] </IRQ>
[ 71.110501] crc32_pclmul
[ 71.114365] asm_sysvec_apic_timer_interrupt (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:632)
[ 71.117264] crc32c_intel
[ 71.120677] RIP: 0010:version_proc_show (kbuild/src/consumer/fs/proc/version.c:15)
[ 71.126951] drm_kms_helper
[ 71.130406] Code: c3 0f 1f 44 00 00 55 48 c7 c6 00 dc 24 82 48 89 fd 48 c7 c7 a8 ed 57 82 e8 af 5d ff ff c6 05 90 60 ba 01 01 8a 05 8a 60 ba 01 <84> c0 74 04 f3 90 eb f2 65 48 8b 04 25 00 6f 01 00 48 8b 80 98 0b
All code
========
0: c3 retq
1: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
6: 55 push %rbp
7: 48 c7 c6 00 dc 24 82 mov $0xffffffff8224dc00,%rsi
e: 48 89 fd mov %rdi,%rbp
11: 48 c7 c7 a8 ed 57 82 mov $0xffffffff8257eda8,%rdi
18: e8 af 5d ff ff callq 0xffffffffffff5dcc
1d: c6 05 90 60 ba 01 01 movb $0x1,0x1ba6090(%rip) # 0x1ba60b4
24: 8a 05 8a 60 ba 01 mov 0x1ba608a(%rip),%al # 0x1ba60b4
2a:* 84 c0 test %al,%al <-- trapping instruction
2c: 74 04 je 0x32
2e: f3 90 pause
30: eb f2 jmp 0x24
32: 65 48 8b 04 25 00 6f mov %gs:0x16f00,%rax
39: 01 00
3b: 48 rex.W
3c: 8b .byte 0x8b
3d: 80 .byte 0x80
3e: 98 cwtl
3f: 0b .byte 0xb
Code starting with the faulting instruction
===========================================
0: 84 c0 test %al,%al
2: 74 04 je 0x8
4: f3 90 pause
6: eb f2 jmp 0xfffffffffffffffa
8: 65 48 8b 04 25 00 6f mov %gs:0x16f00,%rax
f: 01 00
11: 48 rex.W
12: 8b .byte 0x8b
13: 80 .byte 0x80
14: 98 cwtl
15: 0b .byte 0xb
[ 71.136243] ghash_clmulni_intel
[ 71.139898] RSP: 0018:ffffc9000b84bdd0 EFLAGS: 00000202
[ 71.162013] isci
[ 71.166220]
[ 71.172580] syscopyarea
[ 71.175291] RAX: 0000000000000001 RBX: ffff888111af7ca8 RCX: 0000000000000000
[ 71.177465] sysfillrect
[ 71.180792] RDX: 0000000000000000 RSI: ffff888f02a177f0 RDI: ffff888f02a177f0
[ 71.189291] rapl
[ 71.192595] RBP: ffff888111af7ca8 R08: ffff888f02a177f0 R09: ffffc9000b84bbf0
[ 71.201111] sysimgblt
[ 71.203764] R10: 0000000000000001 R11: 0000000000000001 R12: ffffc9000b84be88
[ 71.212307] libsas
[ 71.215453] R13: ffffc9000b84be60 R14: ffff888111af7cd0 R15: 0000000000000001
[ 71.223956] fb_sys_fops
[ 71.226799] seq_read_iter (kbuild/src/consumer/fs/seq_file.c:227)
[ 71.235309] ahci
[ 71.238628] proc_reg_read_iter (kbuild/src/consumer/fs/proc/inode.c:311)
[ 71.243337] intel_cstate
[ 71.246008] new_sync_read (kbuild/src/consumer/fs/read_write.c:416 (discriminator 1))
[ 71.250990] ipmi_si
[ 71.254390] vfs_read (kbuild/src/consumer/fs/read_write.c:496)
[ 71.259089] scsi_transport_sas
[ 71.262024] ksys_read (kbuild/src/consumer/fs/read_write.c:634)
[ 71.266253] libahci
[ 71.270260] do_syscall_64 (kbuild/src/consumer/arch/x86/entry/common.c:46)
[ 71.274349] mei_me
[ 71.277282] entry_SYSCALL_64_after_hwframe (kbuild/src/consumer/arch/x86/entry/entry_64.S:112)
[ 71.281743] ipmi_devintf
[ 71.284552] RIP: 0033:0x7f3b1fcd5461
[ 71.290647] ioatdma
[ 71.294094] Code: fe ff ff 50 48 8d 3d fe d0 09 00 e8 e9 03 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 99 62 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
All code
========
0: fe (bad)
1: ff (bad)
2: ff 50 48 callq *0x48(%rax)
5: 8d 3d fe d0 09 00 lea 0x9d0fe(%rip),%edi # 0x9d109
b: e8 e9 03 02 00 callq 0x203f9
10: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
17: 00 00
19: 48 8d 05 99 62 0d 00 lea 0xd6299(%rip),%rax # 0xd62b9
20: 8b 00 mov (%rax),%eax
22: 85 c0 test %eax,%eax
24: 75 13 jne 0x39
26: 31 c0 xor %eax,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 57 ja 0x89
32: c3 retq
33: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
39: 41 54 push %r12
3b: 49 89 d4 mov %rdx,%r12
3e: 55 push %rbp
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 57 ja 0x5f
8: c3 retq
9: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
f: 41 54 push %r12
11: 49 89 d4 mov %rdx,%r12
14: 55 push %rbp
15: 48 rex.W
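Because the output of the two CPUs is interleaved in the log above, it can help to pull out just the soft lockup headers. The following is a minimal sketch, assuming the line format seen in this report; the names `LOCKUP_RE` and `find_soft_lockups` are hypothetical and are not part of lkp-tests.

```python
import re

# Matches lines such as:
#   watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [perf:1794]
LOCKUP_RE = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+?):(?P<pid>\d+)\]"
)


def find_soft_lockups(dmesg: str):
    """Return one dict per soft lockup report found in the dmesg text."""
    return [
        {
            "cpu": int(m["cpu"]),
            "secs": int(m["secs"]),
            "comm": m["comm"],
            "pid": int(m["pid"]),
        }
        for m in LOCKUP_RE.finditer(dmesg)
    ]


sample = """\
[ 70.666742] watchdog: BUG: soft lockup - CPU#8 stuck for 26s! [perf:1794]
[ 70.922741] watchdog: BUG: soft lockup - CPU#27 stuck for 27s! [perf:1882]
"""

for hit in find_soft_lockups(sample):
    print(hit)
```

The pattern searches anywhere in a line, so the timestamp prefix does not need to be stripped first.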
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml
---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation
Thanks,
Oliver Sang
View attachment "config-5.12.0-rc2-00303-gc928e9b1439d" of type "text/plain" (172899 bytes)
View attachment "job-script" of type "text/plain" (8366 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (27364 bytes)
View attachment "job.yaml" of type "text/plain" (5524 bytes)