Message-Id: <1471262806-10789-1-git-send-email-wanpeng.li@hotmail.com>
Date: Mon, 15 Aug 2016 20:06:46 +0800
From: Wanpeng Li <kernellwp@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Wanpeng Li <wanpeng.li@...mail.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krcmar <rkrcmar@...hat.com>,
Mike Galbraith <efault@....de>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: [PATCH] sched/cputime: Resync time when guest & host lose sync
From: Wanpeng Li <wanpeng.li@...mail.com>
Commit:
57430218317e ("sched/cputime: Count actually elapsed irq & softirq time")
... triggered a regression:
On an i5 laptop with 4 pCPUs, I run one full dynticks guest with 4 vCPUs and
four CPU hog processes (for loops) inside it. I hot-unplug the pCPUs on the
host one by one until only one is left, then watch top in the guest: it shows
100% st for cpu0 (housekeeping) and 75% st for the other CPUs (nohz full
mode). Without this commit, top shows 75% st for all four CPUs.
As Rik and Paolo pointed out:
| It turns out that if a guest misses several timer ticks in a row, they
| will simply get lost.
|
| That means the functions calling steal_account_process_time may not know
| how much CPU time has passed since the last time it was called, but
| steal_account_process_time will get a good idea on how much time the host
| spent running something else.
This patch fixes it by removing the max cputime limit for tick-based sampling,
while keeping the limit for vtime so that steal_account_process_time() will
not attempt to remove more than the limit there.
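To illustrate why the limit matters, here is a rough userspace sketch of a clamped steal accounting step. It is not the kernel code: struct steal_model, model_steal_account() and model_backlog() are hypothetical names, times are abstract jiffies, and the assumption is simply that the host steals a fixed amount of time between two ticks that actually reach the guest.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not kernel code. All times are abstract jiffies. */
struct steal_model {
	uint64_t prev_steal;	/* steal time already accounted */
	uint64_t accounted;	/* total steal charged so far */
};

/*
 * Charge the steal time seen since the last call, clamped to maxtime.
 * Only the charged amount is added to prev_steal, so anything above
 * maxtime stays pending for the next call.
 */
static uint64_t model_steal_account(struct steal_model *m,
				    uint64_t steal_clock, uint64_t maxtime)
{
	uint64_t steal;

	assert(steal_clock >= m->prev_steal);
	steal = steal_clock - m->prev_steal;
	if (steal > maxtime)
		steal = maxtime;

	m->prev_steal += steal;
	m->accounted += steal;
	return steal;
}

/*
 * Assumed workload: the host steals stolen_per_tick jiffies between two
 * ticks delivered to the guest. Returns the steal time that is still
 * unaccounted after the given number of delivered ticks.
 */
static uint64_t model_backlog(uint64_t maxtime, unsigned int ticks,
			      uint64_t stolen_per_tick)
{
	struct steal_model m = { 0, 0 };
	uint64_t steal_clock = 0;
	unsigned int i;

	for (i = 0; i < ticks; i++) {
		steal_clock += stolen_per_tick;
		model_steal_account(&m, steal_clock, maxtime);
	}
	return steal_clock - m.prev_steal;
}
```

In this model, model_backlog(1, 100, 3) leaves a growing pile of unaccounted steal time because each tick may charge at most one jiffy, while model_backlog(UINT64_MAX, 100, 3) charges everything as soon as it is seen. The sketch only shows the clamping effect; it does not try to reproduce the st percentages reported by top.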
Suggested-by: Rik van Riel <riel@...hat.com>
Suggested-by: Paolo Bonzini <pbonzini@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>
Cc: Radim Krcmar <rkrcmar@...hat.com>
Cc: Mike Galbraith <efault@....de>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Wanpeng Li <wanpeng.li@...mail.com>
---
kernel/sched/cputime.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9858266..a119304 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -263,6 +263,11 @@ void account_idle_time(cputime_t cputime)
cpustat[CPUTIME_IDLE] += (__force u64) cputime;
}
+/*
+ * After a host system is overloaded, the missed clock ticks are not
+ * redelivered to guest later. Due to that, this function may on
+ * occasion account more time than the calling functions think elapsed.
+ */
static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
{
#ifdef CONFIG_PARAVIRT
@@ -371,7 +376,7 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
* idle, or potentially user or system time. Due to rounding,
* other time can exceed ticks occasionally.
*/
- other = account_other_time(cputime);
+ other = account_other_time(ULONG_MAX);
if (other >= cputime)
return;
cputime -= other;
@@ -486,7 +491,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
}
cputime = cputime_one_jiffy;
- steal = steal_account_process_time(cputime);
+ steal = steal_account_process_time(ULONG_MAX);
if (steal >= cputime)
return;
@@ -516,7 +521,7 @@ void account_idle_ticks(unsigned long ticks)
}
cputime = jiffies_to_cputime(ticks);
- steal = steal_account_process_time(cputime);
+ steal = steal_account_process_time(ULONG_MAX);
if (steal >= cputime)
return;
--
1.9.1