lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 30 Oct 2017 07:44:50 +0100
From:   Fengguang Wu <fengguang.wu@...el.com>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: [migration_cpu_stop] WARNING: CPU: 0 PID: 11 at
 arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x69/0x9e

On Sun, Oct 29, 2017 at 11:51:55PM +0100, Fengguang Wu wrote:
>Hi Linus,
>
>Up to now we see the below boot error/warnings when testing v4.14-rc6.
>
>They hit the RC release mainly due to various imperfections in 0day's
>auto bisection. So I manually list them here and CC the likely easy to
>debug ones to the corresponding maintainers in the followup emails.
>
>boot_successes: 4700
>boot_failures: 247

[...]

>WARNING:at_arch/x86/kernel/smp.c:#native_smp_send_reschedule: 1

That's bisected to an update of RCU torture parameter:

commit 492b95e59735998312f678d77a2d5fe20af6b0b9
Author:     Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
AuthorDate: Fri Apr 21 16:09:15 2017 -0700
Commit:     Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
CommitDate: Thu Jun 8 08:25:31 2017 -0700

    rcuperf: Set more user-friendly defaults

    Common-case use of rcuperf must set rcuperf.nreaders=0 and if not built
    as a module, rcuperf.shutdown.  This commit therefore sets the default
    for rcuperf.nreaders to zero and sets the default for rcuperf.shutdown
    to zero if rcuperf is built as a module and to one otherwise.

    Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
---
 kernel/rcu/rcuperf.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 5158ddba67162..49c8ed6bd2fda 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -63,9 +63,10 @@ torture_param(bool, gp_async, false, "Use asynchronous GP wait primitives");
 torture_param(int, gp_async_max, 1000, "Max # outstanding waits per reader");
 torture_param(bool, gp_exp, false, "Use expedited GP wait primitives");
 torture_param(int, holdoff, 10, "Holdoff time before test start (s)");
-torture_param(int, nreaders, -1, "Number of RCU reader threads");
+torture_param(int, nreaders, 0, "Number of RCU reader threads");
 torture_param(int, nwriters, -1, "Number of RCU updater threads");
-torture_param(bool, shutdown, false, "Shutdown at end of performance tests.");
+torture_param(bool, shutdown, !IS_ENABLED(MODULE),
+             "Shutdown at end of performance tests.");
 torture_param(bool, verbose, true, "Enable verbose debugging printk()s");

 static char *perf_type = "rcu";

Attached the dmesg. Here are the call traces -- note that the first
WARNING only shows up in latest mainline tree, so might be irrelevant
to the above commit.

[   24.691613] rcu-perf:    1 writer-duration:    97 13680809
[   24.692466] rcu-perf:    1 writer-duration:    98 9561892
[   24.693192] rcu-perf:    1 writer-duration:    99 7970345
[   24.693816] rcu-perf:    1 writer-duration:   100 9788023
[   24.700210] no ifx modem active;
[   24.701420] ACPI: Preparing to enter system sleep state S5
[   24.704534] reboot: Power down
[   24.705645] acpi_power_off called
[   24.713350] ------------[ cut here ]------------
[   24.713804] WARNING: CPU: 0 PID: 11 at kernel/sched/core.c:1179 set_task_cpu+0x157/0x285
[   24.714779] Modules linked in: i2c_piix4 input_leds psmouse i2c_core button
[   24.715437] CPU: 0 PID: 11 Comm: migration/0 Not tainted 4.14.0-rc6-00077-g11224e1 #1
[   24.716188] task: ffff880032955000 task.stack: ffff88003294c000
[   24.716757] RIP: 0010:set_task_cpu+0x157/0x285
[   24.717193] RSP: 0018:ffff88003294fe08 EFLAGS: 00010002
[   24.717746] RAX: 0000000000000001 RBX: ffff880031449000 RCX: 0000000000000000
[   24.718416] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82e2ad60
[   24.719091] RBP: ffff88003294fe20 R08: 0000000000000000 R09: 0000000000000000
[   24.719789] R10: ffff880014483b60 R11: 0000000000000002 R12: 0000000000000001
[   24.720477] R13: 0000000000000001 R14: ffff88003294fe68 R15: ffff8800347eb300
[   24.721155] FS:  0000000000000000(0000) GS:ffff880034600000(0000) knlGS:0000000000000000
[   24.721947] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.722479] CR2: 00007f1cdeec8fd0 CR3: 000000003722d000 CR4: 00000000000006b0
[   24.723144] Call Trace:
[   24.723388]  move_queued_task+0x8c/0x108
[   24.723758]  migration_cpu_stop+0x82/0xb7
[   24.724131]  ? sched_ttwu_pending+0x78/0x78
[   24.724748]  ? sort_range+0x17/0x17
[   24.725214]  cpu_stopper_thread+0x87/0x14c
[   24.725785]  ? sort_range+0x17/0x17
[   24.726194]  smpboot_thread_fn+0x2a9/0x2c0
[   24.726718]  kthread+0xf8/0x100
[   24.727113]  ? kthread_unpark+0x5a/0x5a
[   24.727532]  ret_from_fork+0x25/0x30
[   24.727856] Code: 0f a3 2d 85 d2 e3 01 41 0f 92 c5 41 83 f5 01 31 c9 31 d2 45 0f b6 ed 48 c7 c7 60 ad e2 82 44 89 ee e8 53 24 08 00 45 85 ed 74 02 <0f> ff 31 c9 31 d2 44 89 ee 48 c7 c7 30 ad e2 82 45 31 ed e8 36 
[   24.729721] ---[ end trace b7ae64ce6fa90a4f ]---
[   24.730154] sched: Unexpected reschedule of offline CPU#1!
[   24.730700] ------------[ cut here ]------------
[   24.731133] WARNING: CPU: 0 PID: 11 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x69/0x9e
[   24.732162] Modules linked in: i2c_piix4 input_leds psmouse i2c_core button
[   24.732859] CPU: 0 PID: 11 Comm: migration/0 Tainted: G        W       4.14.0-rc6-00077-g11224e1 #1
[   24.733690] task: ffff880032955000 task.stack: ffff88003294c000
[   24.734449] RIP: 0010:native_smp_send_reschedule+0x69/0x9e
[   24.735158] RSP: 0018:ffff88003294fdf8 EFLAGS: 00010092
[   24.735704] RAX: 000000000000002e RBX: 0000000000000001 RCX: 0000000000000000
[   24.736382] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff82e32ef0
[   24.737009] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[   24.737711] R10: ffff88003294fd30 R11: 0000000000000005 R12: ffff880031449000
[   24.738370] R13: 0000000000000000 R14: ffff88003294fe68 R15: ffff8800347eb300
[   24.738997] FS:  0000000000000000(0000) GS:ffff880034600000(0000) knlGS:0000000000000000
[   24.739754] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   24.740261] CR2: 00007f1cdeec8fd0 CR3: 000000003722d000 CR4: 00000000000006b0
[   24.740921] Call Trace:
[   24.741144]  check_preempt_curr+0x47/0xc1
[   24.741534]  move_queued_task+0xfa/0x108
[   24.741886]  migration_cpu_stop+0x82/0xb7
[   24.742240]  ? sched_ttwu_pending+0x78/0x78
[   24.742649]  ? sort_range+0x17/0x17
[   24.742974]  cpu_stopper_thread+0x87/0x14c
[   24.743401]  ? sort_range+0x17/0x17
[   24.743731]  smpboot_thread_fn+0x2a9/0x2c0
[   24.744111]  kthread+0xf8/0x100
[   24.744414]  ? kthread_unpark+0x5a/0x5a
[   24.744769]  ret_from_fork+0x25/0x30
[   24.745116] Code: ed 74 44 b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 50 c4 e1 82 e8 e4 d4 0d 00 89 de 48 c7 c7 76 4c 86 81 31 c0 e8 e1 5e 08 00 <0f> ff 5b 5d b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 20 c4 
[   24.746913] ---[ end trace b7ae64ce6fa90a50 ]---

Elapsed time: 2040
BUG: kernel hang in test stage

Thanks,
Fengguang

View attachment "dmesg-vm-vp-1G-1:20171028170943:x86_64-randconfig-n0-10180128:4.14.0-rc6-00077-g11224e1:1" of type "text/plain" (105859 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ