[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <67e245f8-eff3-98e2-68aa-04376f886385@arm.com>
Date: Tue, 3 Nov 2020 16:36:14 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, linux-kernel@...r.kernel.org,
valentin.schneider@....com, morten.rasmussen@....com,
ouwen210@...mail.com
Subject: Re: [PATCH v3] sched/fair: prefer prev cpu in asymmetric wakeup path
On 29/10/2020 17:18, Vincent Guittot wrote:
[...]
> - On hikey960 with performance governor (EAS disable)
>
> ./perf bench sched pipe -T -l 50000
> mainline w/ patch
> # migrations 999364 0
> ops/sec 149313(+/-0.28%) 182587(+/- 0.40) +22%
>
> - On hikey with performance governor
>
> ./perf bench sched pipe -T -l 50000
> mainline w/ patch
> # migrations 0 0
> ops/sec 47721(+/-0.76%) 47899(+/- 0.56) +0.4%
Tested on hikey960 (big cluster 0xf0) with perf gov on tip sched/core +
patch) and defconfig plus:
# CONFIG_ARM_CPUIDLE is not set
# CONFIG_CPU_THERMAL is not set
# CONFIG_HISI_THERMAL is not set
and for 'w/ uclamp' tests:
CONFIG_UCLAMP_TASK=y
CONFIG_UCLAMP_BUCKETS_COUNT=5
CONFIG_UCLAMP_TASK_GROUP=y
(a) perf stat -n -r 20 taskset 0xf0 perf bench sched pipe -T -l 50000
(b) perf stat -n -r 20 -- cgexec -g cpu:A/B taskset 0xf0 perf bench
sched pipe -T -l 50000
(1) w/o uclamp
(a) w/o patch: 0.392850 +- 0.000289 seconds time elapsed ( +- 0.07% )
w/ patch: 0.330786 +- 0.000401 seconds time elapsed ( +- 0.12% )
(b) w/o patch: 0.414644 +- 0.000375 seconds time elapsed ( +- 0.09% )
w/ patch: 0.353113 +- 0.000393 seconds time elapsed ( +- 0.11% )
(2) w/ uclamp
(a) w/o patch: 0.393781 +- 0.000488 seconds time elapsed ( +- 0.12% )
w/ patch: 0.342726 +- 0.000661 seconds time elapsed ( +- 0.19% )
(b) w/o patch: 0.416645 +- 0.000520 seconds time elapsed ( +- 0.12% )
w/ patch: 0.358098 +- 0.000577 seconds time elapsed ( +- 0.16% )
Tested-by: Dietmar Eggemann <dietmar.eggemann@....com>
> According to test on hikey, the patch doesn't impact symmetric system
> compared to current implementation (only tested on arm64)
>
> Also read the uclamped value of task's utilization at most twice instead
> instead each time we compare task's utilization with cpu's capacity.
task_util could be passed into select_idle_capacity() avoiding the
second call to uclamp_task_util()?
With this I see a small improvement for (a)
(3) w/ uclamp and passing task_util into sic()
(a) w/ patch: 0.337032 +- 0.000564 seconds time elapsed ( +- 0.17% )
(b) w/ patch: 0.358467 +- 0.000381 seconds time elapsed ( +- 0.11% )
[...]
> -symmetric:
> - if (available_idle_cpu(target) || sched_idle_cpu(target))
> + if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> + asym_fits_capacity(task_util, target))
> return target;
Braces because of multi-line condition ?
> /*
> * If the previous CPU is cache affine and idle, don't be stupid:
> */
> if (prev != target && cpus_share_cache(prev, target) &&
> - (available_idle_cpu(prev) || sched_idle_cpu(prev)))
> + (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> + asym_fits_capacity(task_util, prev))
> return prev;
and here ...
[...]
Powered by blists - more mailing lists