Date: Wed, 27 Apr 2016 01:02:24 +0200
From: Ben Hutchings <ben@...adent.org.uk>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC: akpm@...ux-foundation.org,
 "Linus Torvalds" <torvalds@...ux-foundation.org>,
 "Peter Zijlstra" <peterz@...radead.org>,
 "Rik van Riel" <riel@...hat.com>,
 "Thomas Gleixner" <tglx@...utronix.de>,
 "Glauber Costa" <glommer@...allels.com>,
 "Frederic Weisbecker" <fweisbec@...il.com>,
 "Ingo Molnar" <mingo@...nel.org>
Subject: [PATCH 3.2 021/115] sched/cputime: Fix steal time accounting vs. CPU hotplug

3.2.80-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@...utronix.de>

commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.

On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time
value over CPU down and up. So after the CPU comes up again the delta
calculation in steal_account_process_tick() wreckages itself due to the
unsigned math:

	u64 steal = paravirt_steal_clock(smp_processor_id());

	steal -= this_rq()->prev_steal_time;

So if steal is smaller than rq->prev_steal_time we end up with an insane large
value which then gets added to rq->prev_steal_time, resulting in a permanent
wreckage of the accounting. As a consequence the per CPU stats in /proc/stat
become stale.

Nice trick to tell the world how idle the system is (100%) while the CPU is
100% busy running tasks. Though we prefer realistic numbers.

None of the accounting values which use a previous value to account for
fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
deals with clock warps and limits the /proc/stat visible wreckage. The
prev_time values are still wrong.

Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again.

Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Acked-by: Rik van Riel <riel@...hat.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Glauber Costa <glommer@...allels.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar <mingo@...nel.org>
[bwh: Backported to 3.2: adjust filenames]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -2084,6 +2084,19 @@ EXPORT_SYMBOL_GPL(account_system_vtime);
 
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
+static inline void account_reset_rq(struct rq *rq)
+{
+#ifdef CONFIG_IRQ_TIME_ACCOUNTING
+	rq->prev_irq_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT
+	rq->prev_steal_time = 0;
+#endif
+#ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
+	rq->prev_steal_time_rq = 0;
+#endif
+}
+
 #ifdef CONFIG_PARAVIRT
 static inline u64 steal_ticks(u64 steal)
 {
@@ -6851,6 +6864,7 @@ migration_call(struct notifier_block *nf
 
 	case CPU_UP_PREPARE:
 		rq->calc_load_update = calc_load_update;
+		account_reset_rq(rq);
 		break;
 
 	case CPU_ONLINE: