Message-Id: <20200428132450.24901-2-ggherdovich@suse.cz>
Date: Tue, 28 Apr 2020 15:24:49 +0200
From: Giovanni Gherdovich <ggherdovich@...e.cz>
To: Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...e.de>, Len Brown <lenb@...nel.org>,
"Rafael J . Wysocki" <rjw@...ysocki.net>
Cc: x86@...nel.org, linux-pm@...r.kernel.org,
linux-kernel@...r.kernel.org,
Ricardo Neri <ricardo.neri-calderon@...ux.intel.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Giovanni Gherdovich <ggherdovich@...e.cz>
Subject: [PATCH 1/2] x86, sched: Prevent divisions by zero in frequency invariant accounting
The product mcnt * arch_max_freq_ratio can be zero if the multiplication overflows u64.
For context, a large value for arch_max_freq_ratio would be 5000,
corresponding to a turbo_freq/base_freq ratio of 5 (normally it's more like
1500-2000). A fast MPERF counter would increment at 5GHz (the base clock of
every CPU on the market today is below that). With these figures, a CPU
would need to go without a scheduler tick for around 8 days for the u64
overflow to happen. It is unlikely, but the check is warranted.
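As a back-of-the-envelope check (not part of the patch, just the arithmetic
behind the 8-day figure, using the worst-case numbers above):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            /* Worst-case figures quoted above. */
            const uint64_t mperf_hz       = 5000ULL * 1000 * 1000; /* 5 GHz   */
            const uint64_t max_freq_ratio = 5000;                  /* ratio 5 */

            /* mcnt * max_freq_ratio wraps once mcnt exceeds 2^64 / ratio. */
            uint64_t wrap_mcnt = UINT64_MAX / max_freq_ratio;
            uint64_t wrap_secs = wrap_mcnt / mperf_hz;

            printf("u64 overflow after ~%llu s (~%llu days)\n",
                   (unsigned long long)wrap_secs,
                   (unsigned long long)(wrap_secs / 86400));
            return 0;
    }

which prints ~737869 seconds, i.e. about 8.5 days.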
In that (overflow) case it's also appropriate to disable frequency invariant
accounting altogether: the feature relies on clock frequency measurements
taken at every scheduler tick, and those need to be fresh to be at all
meaningful.
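(For reference, the consumer side: the scheduler reads the per-cpu scale
factor through arch_scale_freq_capacity(), gated by the same static key.
Sketched from memory of arch/x86/include/asm/topology.h as of this series,
so the details may differ slightly:

    DECLARE_STATIC_KEY_FALSE(arch_scale_freq_key);
    #define arch_scale_freq_invariant() static_branch_likely(&arch_scale_freq_key)

    DECLARE_PER_CPU(unsigned long, arch_freq_scale);

    static inline long arch_scale_freq_capacity(int cpu)
    {
            return per_cpu(arch_freq_scale, cpu);
    }

Once arch_scale_freq_key is disabled, arch_scale_freq_tick() bails out early
via arch_scale_freq_invariant() and stops updating arch_freq_scale.)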
Signed-off-by: Giovanni Gherdovich <ggherdovich@...e.cz>
Fixes: 1567c3e3467c ("x86, sched: Add support for frequency invariance")
---
arch/x86/kernel/smpboot.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 8c89e4d9ad28..4718f29a3065 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -2039,6 +2039,14 @@ static void init_freq_invariance(bool secondary)
 	}
 }
 
+static void disable_freq_invariance_workfn(struct work_struct *work)
+{
+	static_branch_disable(&arch_scale_freq_key);
+}
+
+static DECLARE_WORK(disable_freq_invariance_work,
+		    disable_freq_invariance_workfn);
+
 DEFINE_PER_CPU(unsigned long, arch_freq_scale) = SCHED_CAPACITY_SCALE;
 
 void arch_scale_freq_tick(void)
@@ -2055,14 +2063,18 @@ void arch_scale_freq_tick(void)
 	acnt = aperf - this_cpu_read(arch_prev_aperf);
 	mcnt = mperf - this_cpu_read(arch_prev_mperf);
-	if (!mcnt)
-		return;
 
 	this_cpu_write(arch_prev_aperf, aperf);
 	this_cpu_write(arch_prev_mperf, mperf);
 
 	acnt <<= 2*SCHED_CAPACITY_SHIFT;
 	mcnt *= arch_max_freq_ratio;
 
+	if (!mcnt) {
+		pr_warn("Scheduler tick missing for long time, disabling scale-invariant accounting.\n");
+		/* static_branch_disable() acquires a lock and may sleep */
+		schedule_work(&disable_freq_invariance_work);
+		return;
+	}
+
 	freq_scale = div64_u64(acnt, mcnt);
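
(Design note, my gloss rather than the patch author's: arch_scale_freq_tick()
runs from the scheduler tick in hard irq context, where sleeping is
forbidden, while static_branch_disable() takes the jump label mutex and may
sleep; hence the punt to a workqueue. A minimal, self-contained sketch of
that general pattern, with hypothetical names:

    #include <linux/workqueue.h>
    #include <linux/jump_label.h>

    /* Hypothetical feature key, enabled by default. */
    DEFINE_STATIC_KEY_TRUE(my_feature_key);

    static void my_disable_workfn(struct work_struct *work)
    {
            /* Runs in process context, where sleeping is allowed. */
            static_branch_disable(&my_feature_key);
    }
    static DECLARE_WORK(my_disable_work, my_disable_workfn);

    /* Called from tick/irq context: must not sleep, so defer. */
    static void my_hot_path_error(void)
    {
            schedule_work(&my_disable_work);
    }

schedule_work() is safe to call from irq context; the work item then runs
later on the system workqueue, so the sleeping call happens in a legal
context.)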
--
2.16.4