Message-ID: <5760115C.7040306@arm.com>
Date: Tue, 14 Jun 2016 15:14:52 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Mike Galbraith <umgwanakikbuti@...il.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: Yuyang Du <yuyang.du@...el.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing
On 14/06/16 08:58, Mike Galbraith wrote:
> SUSE's regression testing noticed that...
>
> 0905f04eb21f sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task()
>
> ...introduced a hackbench regression, and indeed it does. I think this
> regression has more to do with randomness than anything else, but in
> general...
>
> While averaging calms down load balancing, helping to keep migrations
> down to a dull roar, it's not completely wonderful when it comes to
> things that live in the here and now, hackbench being one such.
>
> time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
>
> real 0m55.397s
> user 0m8.320s
> sys 5m40.789s
>
> echo LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
>
> real 0m48.049s
> user 0m6.510s
> sys 5m6.291s
>
> Signed-off-by: Mike Galbraith <umgwanakikbuti@...il.com>
I see similar values on ARM64 (Juno r0: 2x Cortex-A57, 4x Cortex-A53). OK,
1000 invocations of hackbench take a little bit longer here, but I guess
it's the forks we're after.
- echo NO_LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
root@...o:~# time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
real 10m17.155s
user 2m56.976s
sys 38m0.324s
- echo LB_INSTANTANEOUS_LOAD > /sys/kernel/debug/sched_features
time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
real 9m49.832s
user 2m42.896s
sys 34m51.452s
- But I get a similar effect if I initialize se->avg.load_avg w/ 0:
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -680,7 +680,7 @@ void init_entity_runnable_average(struct sched_entity *se)
         * will definitely be update (after enqueue).
         */
        sa->period_contrib = 1023;
-       sa->load_avg = scale_load_down(se->load.weight);
+       sa->load_avg = scale_load_down(0);
        sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
root@...o:~# time sh -c 'for i in `seq 1000`; do hackbench -p -P > /dev/null; done'
real 9m55.396s
user 2m41.192s
sys 35m6.196s
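To compare the three Juno runs above, the wall-clock and system times can be turned into relative savings against the NO_LB_INSTANTANEOUS_LOAD baseline. The following is only a small arithmetic sketch; the helper names to_seconds() and percent_saved() are made up for illustration and are not part of any tool used in this thread.

```c
#include <assert.h>

/* Hypothetical helper: convert a `time`-style "XmY.ZZZs" reading to seconds. */
static double to_seconds(unsigned int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/*
 * Hypothetical helper: relative saving of `candidate` versus `baseline`,
 * in percent. A positive result means the candidate run was faster.
 */
static double percent_saved(double baseline, double candidate)
{
	assert(baseline > 0.0);
	return 100.0 * (baseline - candidate) / baseline;
}
```

Feeding in the numbers from the runs above, e.g. percent_saved(to_seconds(10, 17.155), to_seconds(9, 49.832)), gives roughly 4.4% less real time and roughly 8.3% less sys time for LB_INSTANTANEOUS_LOAD, and roughly 3.5% / 7.6% for the load_avg = 0 variant.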
IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
fact that a new task gets all its load decayed (making it a small task)
in the __update_load_avg() call in remove_entity_load_avg(): its
se->avg.last_update_time value is 0, which creates a huge time difference
compared to cfs_rq->avg.last_update_time. The patch 0905f04eb21f
avoids this, and thus the task stays big (se->avg.load_avg = 1024).
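The decay effect described above can be illustrated with a simplified user-space model of PELT's geometric decay. This is a sketch, not the kernel's fixed-point implementation: the names sketch_decay_load() and sketch_periods_elapsed() are hypothetical, the decay is applied directly to the load value rather than to load_sum, and the factor y is a floating-point approximation chosen so that y^32 = 0.5.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PERIOD_NS (1024ULL * 1024ULL) /* one period: 1024 * 1024 ns, ~1 ms */
#define SKETCH_HALFLIFE  32                  /* load halves every 32 periods */

/* Number of whole periods between the last update and now. */
static uint64_t sketch_periods_elapsed(uint64_t now_ns, uint64_t last_update_ns)
{
	assert(now_ns >= last_update_ns);
	return (now_ns - last_update_ns) / SKETCH_PERIOD_NS;
}

/*
 * Simplified decay: halve the load once per 32 full periods, then
 * multiply by y for each remaining period (y approximates 0.5^(1/32)).
 */
static uint64_t sketch_decay_load(uint64_t load, uint64_t periods)
{
	const double y = 0.97857206;
	uint64_t halvings = periods / SKETCH_HALFLIFE;
	double val;

	if (halvings >= 64)
		return 0; /* nothing is left after this many half-lives */

	val = (double)(load >> halvings);
	for (uint64_t i = 0; i < periods % SKETCH_HALFLIFE; i++)
		val *= y;

	return (uint64_t)val;
}
```

With last_update_time = 0 and a cfs_rq clock that is already seconds into uptime, sketch_periods_elapsed() yields thousands of periods, so an initial load of 1024 decays all the way to 0 in this model, which matches the "small task" behaviour described above.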
It can't be a difference in the value of cfs_rq->removed_load_avg,
because w/o the patch 0905f04eb21f we atomic_long_add() 0, and with the
patch we bail before the atomic_long_add().
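A toy user-space model of the two code paths may make this argument easier to follow. Everything here is an assumption made for illustration: the toy_* structs and function are hypothetical stand-ins (not the kernel's definitions), C11 atomics replace the kernel's atomic_long_t, and the decay is reduced to "fully decayed once the time gap is large enough".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy threshold: 64 half-lives of 32 periods of 1024 * 1024 ns each (~2.1 s). */
#define TOY_FULL_DECAY_NS (64ULL * 32ULL * 1024ULL * 1024ULL)

struct toy_se {
	uint64_t last_update_time; /* 0 for a freshly forked task */
	long load_avg;
};

struct toy_cfs_rq {
	uint64_t last_update_time;
	atomic_long removed_load_avg;
};

static void toy_remove_entity_load_avg(struct toy_cfs_rq *cfs_rq,
				       struct toy_se *se, bool with_patch)
{
	/* With the patch: a new task bails out before any decay or add. */
	if (with_patch && se->last_update_time == 0)
		return;

	assert(cfs_rq->last_update_time >= se->last_update_time);

	/* Without the patch: the huge time gap decays the load to 0 ... */
	if (cfs_rq->last_update_time - se->last_update_time >= TOY_FULL_DECAY_NS)
		se->load_avg = 0;

	/* ... so the value added here is 0. */
	atomic_fetch_add(&cfs_rq->removed_load_avg, se->load_avg);
}
```

In both variants removed_load_avg ends up unchanged for a new task; the only difference in this model is that the task's own load_avg is 0 without the patch and still 1024 with it.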
[...]