Message-Id: <1521505730-1866-1-git-send-email-rohit.k.jain@oracle.com>
Date: Mon, 19 Mar 2018 17:28:50 -0700
From: Rohit Jain <rohit.k.jain@...cle.com>
To: linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, steven.sistare@...cle.com,
joelaf@...gle.com, jbacik@...com, juri.lelli@...hat.com,
dhaval.giani@...cle.com, efault@....de, riel@...riel.com
Subject: [PATCH v3] sched/fair: Remove check in idle_balance against migration_cost
Patch "1b9508f6 sched: Rate-limit newidle" reduced the CPU time spent in
idle_balance() by refusing to balance if the average idle time was less
than sysctl_sched_migration_cost. Since then, more refined methods for
reducing CPU time have been added, including dynamic measurement of search
cost in curr_cost and a check for this_rq->rd->overload. The original
check of sysctl_sched_migration_cost is no longer necessary, and is in
fact harmful because it discourages load balancing, so delete it.
1) An internal Oracle RDBMS OLTP test on an 8-socket Exadata system shows
a 2.2% gain in throughput.
2) Hackbench results on a 2-socket, 44-core, 88-thread Intel x86 machine
(lower is better):
+--------------+-----------------+-------------------------+
| | Without Patch |With Patch |
+------+-------+--------+--------+----------------+--------+
|Loops | Groups|Average |%Std Dev|Average |%Std Dev|
+------+-------+--------+--------+----------------+--------+
|100000| 4 |8.313 |0.64 |8.284 (+0.35%) |2.09 |
|100000| 8 |14.606 |1.73 |14.451 (+1.07%) |1.32 |
|100000| 16 |26.203 |0.72 |25.509 (+2.65%) |0.19 |
|100000| 25 |38.270 |0.20 |36.545 (+4.51%) |0.30 |
+------+-------+--------+--------+----------------+--------+
3) tbench sample results on a 2-socket, 44-core, 88-thread Intel x86
machine:
Without Patch:
Throughput 670.753 MB/sec 2 clients 2 procs max_latency=0.150 ms
Throughput 1530.57 MB/sec 5 clients 5 procs max_latency=0.366 ms
Throughput 2911.36 MB/sec 10 clients 10 procs max_latency=0.917 ms
Throughput 5626.88 MB/sec 20 clients 20 procs max_latency=5.037 ms
Throughput 8925.31 MB/sec 40 clients 40 procs max_latency=7.510 ms
With Patch:
Throughput 672.377 MB/sec 2 clients 2 procs max_latency=0.269 ms
Throughput 1562.44 MB/sec 5 clients 5 procs max_latency=5.774 ms
Throughput 2973.76 MB/sec 10 clients 10 procs max_latency=0.527 ms
Throughput 5726.74 MB/sec 20 clients 20 procs max_latency=2.187 ms
Throughput 9162.58 MB/sec 40 clients 40 procs max_latency=4.713 ms
Changelog:
* v1->v2: Dropped the per-domain accounting of load-balance cost in favor
of simply removing the check against the overall migration_cost, which
works well.
* v2->v3: Rebased onto the latest source and re-ran the benchmarks.
Signed-off-by: Rohit Jain <rohit.k.jain@...cle.com>
---
kernel/sched/fair.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3582117..da619fb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9587,8 +9587,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
*/
rq_unpin_lock(this_rq, rf);
- if (this_rq->avg_idle < sysctl_sched_migration_cost ||
- !this_rq->rd->overload) {
+ if (!this_rq->rd->overload) {
rcu_read_lock();
sd = rcu_dereference_check_sched_domain(this_rq->sd);
--
2.7.4