[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <tip-08d072599234c959b0b82b63fa252c129225a899@git.kernel.org>
Date: Fri, 2 Sep 2016 01:31:11 -0700
From: tip-bot for Wanpeng Li <tipbot@...or.com>
To: linux-tip-commits@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, wanpeng.li@...mail.com,
pbonzini@...hat.com, rkrcmar@...hat.com,
sanjeev.yadav@...eadtrum.com, peterz@...radead.org,
mingo@...nel.org, tglx@...utronix.de, gaurav.jindal@...eadtrum.com,
hpa@...or.com
Subject: [tip:timers/urgent] tick/nohz: Fix softlockup on scheduler stalls
in kvm guest
Commit-ID: 08d072599234c959b0b82b63fa252c129225a899
Gitweb: http://git.kernel.org/tip/08d072599234c959b0b82b63fa252c129225a899
Author: Wanpeng Li <wanpeng.li@...mail.com>
AuthorDate: Fri, 2 Sep 2016 14:38:23 +0800
Committer: Thomas Gleixner <tglx@...utronix.de>
CommitDate: Fri, 2 Sep 2016 10:25:40 +0200
tick/nohz: Fix softlockup on scheduler stalls in kvm guest
tick_nohz_start_idle() is prevented to be called if the idle tick can't
be stopped since commit 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle
enter"). As a result, after suspend/resume the host machine, full dynticks
kvm guest will softlockup:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0]
Call Trace:
default_idle+0x31/0x1a0
arch_cpu_idle+0xf/0x20
default_idle_call+0x2a/0x50
cpu_startup_entry+0x39b/0x4d0
rest_init+0x138/0x140
? rest_init+0x5/0x140
start_kernel+0x4c1/0x4ce
? set_init_arg+0x55/0x55
? early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x142/0x14f
In addition, cat /proc/stat | grep cpu in guest or host:
cpu 398 16 5049 15754 5490 0 1 46 0 0
cpu0 206 5 450 0 0 0 1 14 0 0
cpu1 81 0 3937 3149 1514 0 0 9 0 0
cpu2 45 6 332 6052 2243 0 0 11 0 0
cpu3 65 2 328 6552 1732 0 0 11 0 0
The idle and iowait states are weird 0 for cpu0(housekeeping).
The bug is present in both guest and host kernels, and they both have
cpu0's idle and iowait states issue, however, host kernel's suspend/resume
path etc will touch watchdog to avoid the softlockup.
- The watchdog will not be touched in tick_nohz_stop_idle path (need be
touched since the scheduler stall is expected) if idle_active flags are
not detected.
- The idle and iowait states will not be accounted when exit idle loop
(resched or interrupt) if idle start time and idle_active flags are
not set.
This patch fixes it by reverting commit 1f3b0f8243cb934 since can't stop
idle tick doesn't mean can't be idle.
Fixes: 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter")
Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
Cc: Sanjeev Yadav<sanjeev.yadav@...eadtrum.com>
Cc: Gaurav Jindal<gaurav.jindal@...eadtrum.com>
Cc: stable@...r.kernel.org
Cc: kvm@...r.kernel.org
Cc: Radim Krčmář <rkrcmar@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>
Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
---
kernel/time/tick-sched.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 204fdc8..2ec7c00 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -908,10 +908,11 @@ static void __tick_nohz_idle_enter(struct tick_sched *ts)
ktime_t now, expires;
int cpu = smp_processor_id();
+ now = tick_nohz_start_idle(ts);
+
if (can_stop_idle_tick(cpu, ts)) {
int was_stopped = ts->tick_stopped;
- now = tick_nohz_start_idle(ts);
ts->idle_calls++;
expires = tick_nohz_stop_sched_tick(ts, now, cpu);
Powered by blists - more mailing lists