lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1547631791-16018-4-git-send-email-vincent.guittot@linaro.org>
Date:   Wed, 16 Jan 2019 10:43:11 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     peterz@...radead.org, mingo@...nel.org,
        linux-kernel@...r.kernel.org
Cc:     rjw@...ysocki.net, dietmar.eggemann@....com,
        Morten.Rasmussen@....com, patrick.bellasi@....com, pjt@...gle.com,
        bsegall@...gle.com, thara.gopinath@...aro.org,
        pkondeti@...eaurora.org, quentin.perret@....com,
        srinivas.pandruvada@...ux.intel.com,
        Vincent Guittot <vincent.guittot@...aro.org>
Subject: [PATCH v8 3/3] sched/pelt: skip updating util_est when utilization is higher than cpu's capacity

util_est is mainly meant to be a lower-bound for tasks utilization.
That's why task_util_est() returns the actual util_avg when it's higher
than the estimated utilization.

With new invaraince signal and without any special check on samples
collection, if a task is limited because of thermal capping for
example, we could end up overestimating its utilization and thus
perhaps generating an unwanted frequency spike when the capping is
relaxed... and (even worst) it will take some more activations for the
estimated utilization to converge back to the actual utilization.

Since we cannot easily know if there is idle time in a CPU when a task
completes an activation with a utilization higher then the CPU capacity,
we skip the sampling when utilization is higher than cpu's capacity.

Suggested-by: Patrick Bellasi <patrick.bellasi@....com>
Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
---
 kernel/sched/fair.c  | 14 +++++++++-----
 kernel/sched/sched.h |  7 +++++++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9332863..2262c8a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3639,6 +3639,7 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
 {
 	long last_ewma_diff;
 	struct util_est ue;
+	int cpu;
 
 	if (!sched_feat(UTIL_EST))
 		return;
@@ -3673,6 +3674,14 @@ util_est_dequeue(struct cfs_rq *cfs_rq, struct task_struct *p, bool task_sleep)
 		return;
 
 	/*
+	 * To avoid overestimation of actual task utilization, skip updates if
+	 * we cannot grant there is idle time in this CPU.
+	 */
+	cpu = cpu_of(rq_of(cfs_rq));
+	if (task_util(p) > capacity_orig_of(cpu))
+		return;
+
+	/*
 	 * Update Task's estimated utilization
 	 *
 	 * When *p completes an activation we can consolidate another sample
@@ -5541,11 +5550,6 @@ static unsigned long capacity_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity;
 }
 
-static unsigned long capacity_orig_of(int cpu)
-{
-	return cpu_rq(cpu)->cpu_capacity_orig;
-}
-
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4c506ea..455745e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2230,6 +2230,13 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 # define arch_scale_freq_invariant()	false
 #endif
 
+#ifdef CONFIG_SMP
+static inline unsigned long capacity_orig_of(int cpu)
+{
+	return cpu_rq(cpu)->cpu_capacity_orig;
+}
+#endif
+
 #ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
 /**
  * enum schedutil_type - CPU utilization type
-- 
2.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ