Message-Id: <20171109164117.19401-1-patrick.bellasi@arm.com>
Date: Thu, 9 Nov 2017 16:41:13 +0000
From: Patrick Bellasi <patrick.bellasi@....com>
To: linux-kernel@...r.kernel.org
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Paul Turner <pjt@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <morten.rasmussen@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Todd Kjos <tkjos@...roid.com>,
Joel Fernandes <joelaf@...gle.com>
Subject: [PATCH 0/4] Utilization estimation (util_est) for FAIR tasks
The aim of this series is to improve some PELT behaviors to make it a
better fit for the scheduling of tasks common in embedded mobile
use-cases, without affecting other classes of workloads.
A complete description of these behaviors has been presented in the
previous RFC [1] and further discussed at the last OSPM Summit [2] as
well as the last two LPCs.
This series presents an implementation which improves the initial RFC's
prototype. Specifically, this new implementation has been verified to
not impact in any noticeable way the performance of:
perf bench sched messaging --pipe --thread --group 8 --loop 50000
when running 30 iterations on a dual socket, 10 cores (20 threads) per
socket Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz, with the
sched_feat(SCHED_UTILEST) set to False.
With this feature enabled, the measured overhead is about 1% on the
same HW/SW test configuration.
That's the main reason why this sched feature is disabled by default.
A possible improvement can be the addition of a KConfig option to toggle
the sched_feat default value on systems where a 1% overhead on hackbench
is not a concern, e.g. mobile systems, especially considering the
benefits coming from estimated utilization on workloads of interest.
From a functional standpoint, this implementation shows a more stable
utilization signal, compared to mainline, when running synthetic
benchmarks describing a set of interesting target use-cases.
This allows a better selection of the target CPU as well as a faster
selection of the most appropriate OPP.
A detailed description of the functional tests used is already
covered in the previous RFC [1].
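To give a rough idea of why an estimated utilization can be more stable
than a signal which decays while a task sleeps, here is a minimal
user-space sketch. It is not the code from this series: the struct, the
function names and the 1/4 weight are all hypothetical, and combining
the last sample with a moving average is only an assumption about how
such an estimate could be built.

```c
#include <assert.h>

/* Hypothetical weight: each new sample contributes 1/4 to the average. */
#define EST_WEIGHT_SHIFT 2

/* Hypothetical per-task estimate; names are illustrative only. */
struct util_est_sketch {
	unsigned long last;	/* utilization sampled at the latest dequeue */
	unsigned long ewma;	/* moving average of those samples */
};

/* Record the utilization a task had reached when it was dequeued. */
static void est_update(struct util_est_sketch *ue, unsigned long util_at_dequeue)
{
	long diff;

	assert(ue != 0);
	diff = (long)util_at_dequeue - (long)ue->ewma;
	ue->last = util_at_dequeue;
	/* ewma += (sample - ewma) / 4, truncating toward zero */
	ue->ewma = (unsigned long)((long)ue->ewma + diff / (1 << EST_WEIGHT_SHIFT));
}

/*
 * Value a consumer would read: never lower than the remembered samples,
 * even if the decayed signal dropped while the task was sleeping.
 */
static unsigned long est_value(const struct util_est_sketch *ue,
			       unsigned long decayed_util)
{
	unsigned long est;

	assert(ue != 0);
	est = ue->ewma > ue->last ? ue->ewma : ue->last;
	return est > decayed_util ? est : decayed_util;
}
```

In this sketch, a task which reached a utilization of 400 before
sleeping is still reported as 400 at wakeup, even if its decayed signal
has dropped to 50 in the meantime.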
This series is based on v4.14-rc8 and is composed of four patches:
1) a small refactoring to prepare the ground
2) introduce the required data structures to track util_est of both
   TASKs and CPUs
3) make use of util_est in the wakeup and load balance paths
4) make use of util_est in schedutil for frequency selection
Cheers,
Patrick
.:: References
==============
[1] https://lkml.org/lkml/2017/8/25/195
[2] slides: http://retis.sssup.it/ospm-summit/Downloads/OSPM_PELT_DecayClampingVsUtilEst.pdf
video: http://youtu.be/adnSHPBGS-w
Patrick Bellasi (4):
sched/fair: always used unsigned long for utilization
sched/fair: add util_est on top of PELT
sched/fair: use util_est in LB and WU paths
sched/cpufreq_schedutil: use util_est for OPP selection
 include/linux/sched.h            |  21 +++++
 kernel/sched/cpufreq_schedutil.c |   6 +-
 kernel/sched/debug.c             |   4 +
 kernel/sched/fair.c              | 184 ++++++++++++++++++++++++++++++++++++---
 kernel/sched/features.h          |   5 ++
 kernel/sched/sched.h             |   1 +
 6 files changed, 209 insertions(+), 12 deletions(-)
--
2.14.1