[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171030064450.3gz365ofbwhr7elq@wfg-t540p.sh.intel.com>
Date: Mon, 30 Oct 2017 07:44:50 +0100
From: Fengguang Wu <fengguang.wu@...el.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [migration_cpu_stop] WARNING: CPU: 0 PID: 11 at
arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x69/0x9e
On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>Hi Linus,
>
>Up to now we see the below boot error/warnings when testing v4.14-rc6.
>
>They hit the RC release mainly due to various imperfections in 0day's
>auto bisection. So I manually list them here and CC the likely easy to
>debug ones to the corresponding maintainers in the followup emails.
>
>boot_successes: 4700
>boot_failures: 247
[...]
>WARNING:at_arch/x86/kernel/smp.c:#native_smp_send_reschedule: 1
That's bisected to an update of RCU torture parameter:
commit 492b95e59735998312f678d77a2d5fe20af6b0b9
Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
AuthorDate: Fri Apr 21 16:09:15 2017 -0700
Commit: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
CommitDate: Thu Jun 8 08:25:31 2017 -0700
rcuperf: Set more user-friendly defaults
Common-case use of rcuperf must set rcuperf.nreaders=0 and if not built
as a module, rcuperf.shutdown. This commit therefore sets the default
for rcuperf.nreaders to zero and sets the default for rcuperf.shutdown
to zero if rcuperf is built as a module and to one otherwise.
Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
---
kernel/rcu/rcuperf.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 5158ddba67162..49c8ed6bd2fda 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -63,9 +63,10 @@ torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives");
torture_param(int, gp_async_max, 1000, "Max # outstanding waits per reader");
torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
-torture_param(int, nreaders, -1, "Number of RCU reader threads");
+torture_param(int, nreaders, 0, "Number of RCU reader threads");
torture_param(int, nwriters, -1, "Number of RCU updater threads");
-torture_param(bool, shutdown, false, "Shutdown at end of performance tests.");
+torture_param(bool, shutdown, !IS_ENABLED(MODULE),
+ "Shutdown at end of performance tests.");
torture_param(bool, verbose, true, "Enable verbose debugging printk()s");
static char *perf_type = "rcu";
Attached the dmesg. Here are the call traces -- note that the first
WARNING only shows up in latest mainline tree, so might be irrelevant
to the above commit.
[ 24.691613] rcu-perf: 1 writer-duration: 97 13680809
[ 24.692466] rcu-perf: 1 writer-duration: 98 9561892
[ 24.693192] rcu-perf: 1 writer-duration: 99 7970345
[ 24.693816] rcu-perf: 1 writer-duration: 100 9788023
[ 24.700210] no ifx modem active;
[ 24.701420] ACPI: Preparing to enter system sleep state S5
[ 24.704534] reboot: Power down
[ 24.705645] acpi_power_off called
[ 24.713350] ------------[ cut here ]------------
[ 24.713804] WARNING: CPU: 0 PID: 11 at kernel/sched/core.c:1179 set_task_cpu+0x157/0x285
[ 24.714779] Modules linked in: i2c_piix4 input_leds psmouse i2c_core button
[ 24.715437] CPU: 0 PID: 11 Comm: migration/0 Not tainted 4.14.0-rc6-00077-g11224e1 #1
[ 24.716188] task: ffff880032955000 task.stack: ffff88003294c000
[ 24.716757] RIP: 0010:set_task_cpu+0x157/0x285
[ 24.717193] RSP: 0018:ffff88003294fe08 EFLAGS: 00010002
[ 24.717746] RAX: 0000000000000001 RBX: ffff880031449000 RCX: 0000000000000000
[ 24.718416] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82e2ad60
[ 24.719091] RBP: ffff88003294fe20 R08: 0000000000000000 R09: 0000000000000000
[ 24.719789] R10: ffff880014483b60 R11: 0000000000000002 R12: 0000000000000001
[ 24.720477] R13: 0000000000000001 R14: ffff88003294fe68 R15: ffff8800347eb300
[ 24.721155] FS: 0000000000000000(0000) GS:ffff880034600000(0000) knlGS:0000000000000000
[ 24.721947] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 24.722479] CR2: 00007f1cdeec8fd0 CR3: 000000003722d000 CR4: 00000000000006b0
[ 24.723144] Call Trace:
[ 24.723388] move_queued_task+0x8c/0x108
[ 24.723758] migration_cpu_stop+0x82/0xb7
[ 24.724131] ? sched_ttwu_pending+0x78/0x78
[ 24.724748] ? sort_range+0x17/0x17
[ 24.725214] cpu_stopper_thread+0x87/0x14c
[ 24.725785] ? sort_range+0x17/0x17
[ 24.726194] smpboot_thread_fn+0x2a9/0x2c0
[ 24.726718] kthread+0xf8/0x100
[ 24.727113] ? kthread_unpark+0x5a/0x5a
[ 24.727532] ret_from_fork+0x25/0x30
[ 24.727856] Code: 0f a3 2d 85 d2 e3 01 41 0f 92 c5 41 83 f5 01 31 c9 31 d2 45 0f b6 ed 48 c7 c7 60 ad e2 82 44 89 ee e8 53 24 08 00 45 85 ed 74 02 <0f> ff 31 c9 31 d2 44 89 ee 48 c7 c7 30 ad e2 82 45 31 ed e8 36
[ 24.729721] ---[ end trace b7ae64ce6fa90a4f ]---
[ 24.730154] sched: Unexpected reschedule of offline CPU#1!
[ 24.730700] ------------[ cut here ]------------
[ 24.731133] WARNING: CPU: 0 PID: 11 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x69/0x9e
[ 24.732162] Modules linked in: i2c_piix4 input_leds psmouse i2c_core button
[ 24.732859] CPU: 0 PID: 11 Comm: migration/0 Tainted: G W 4.14.0-rc6-00077-g11224e1 #1
[ 24.733690] task: ffff880032955000 task.stack: ffff88003294c000
[ 24.734449] RIP: 0010:native_smp_send_reschedule+0x69/0x9e
[ 24.735158] RSP: 0018:ffff88003294fdf8 EFLAGS: 00010092
[ 24.735704] RAX: 000000000000002e RBX: 0000000000000001 RCX: 0000000000000000
[ 24.736382] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff82e32ef0
[ 24.737009] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[ 24.737711] R10: ffff88003294fd30 R11: 0000000000000005 R12: ffff880031449000
[ 24.738370] R13: 0000000000000000 R14: ffff88003294fe68 R15: ffff8800347eb300
[ 24.738997] FS: 0000000000000000(0000) GS:ffff880034600000(0000) knlGS:0000000000000000
[ 24.739754] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 24.740261] CR2: 00007f1cdeec8fd0 CR3: 000000003722d000 CR4: 00000000000006b0
[ 24.740921] Call Trace:
[ 24.741144] check_preempt_curr+0x47/0xc1
[ 24.741534] move_queued_task+0xfa/0x108
[ 24.741886] migration_cpu_stop+0x82/0xb7
[ 24.742240] ? sched_ttwu_pending+0x78/0x78
[ 24.742649] ? sort_range+0x17/0x17
[ 24.742974] cpu_stopper_thread+0x87/0x14c
[ 24.743401] ? sort_range+0x17/0x17
[ 24.743731] smpboot_thread_fn+0x2a9/0x2c0
[ 24.744111] kthread+0xf8/0x100
[ 24.744414] ? kthread_unpark+0x5a/0x5a
[ 24.744769] ret_from_fork+0x25/0x30
[ 24.745116] Code: ed 74 44 b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 50 c4 e1 82 e8 e4 d4 0d 00 89 de 48 c7 c7 76 4c 86 81 31 c0 e8 e1 5e 08 00 <0f> ff 5b 5d b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 20 c4
[ 24.746913] ---[ end trace b7ae64ce6fa90a50 ]---
Elapsed time: 2040
BUG: kernel hang in test stage
Thanks,
Fengguang
View attachment "dmesg-vm-vp-1G-1:20171028170943:x86_64-randconfig-n0-10180128:4.14.0-rc6-00077-g11224e1:1" of type "text/plain" (105859 bytes)
Powered by blists - more mailing lists