Message-ID: <67e245f8-eff3-98e2-68aa-04376f886385@arm.com>
Date:   Tue, 3 Nov 2020 16:36:14 +0100
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, linux-kernel@...r.kernel.org,
        valentin.schneider@....com, morten.rasmussen@....com,
        ouwen210@...mail.com
Subject: Re: [PATCH v3] sched/fair: prefer prev cpu in asymmetric wakeup path

On 29/10/2020 17:18, Vincent Guittot wrote:

[...]

> - On hikey960 with performance governor (EAS disable)
> 
> ./perf bench sched pipe -T -l 50000
>              mainline           w/ patch
> # migrations   999364                  0
> ops/sec        149313(+/-0.28%)   182587(+/- 0.40) +22%
> 
> - On hikey with performance governor
> 
> ./perf bench sched pipe -T -l 50000
>              mainline           w/ patch
> # migrations        0                  0
> ops/sec         47721(+/-0.76%)    47899(+/- 0.56) +0.4%

Tested on hikey960 (big cluster 0xf0) with the performance governor, on
tip sched/core (+ patch), with defconfig plus:

# CONFIG_ARM_CPUIDLE is not set
# CONFIG_CPU_THERMAL is not set
# CONFIG_HISI_THERMAL is not set

and for 'w/ uclamp' tests:

CONFIG_UCLAMP_TASK=y
CONFIG_UCLAMP_BUCKETS_COUNT=5
CONFIG_UCLAMP_TASK_GROUP=y

(a) perf stat -n -r 20 taskset 0xf0 perf bench sched pipe -T -l 50000

(b) perf stat -n -r 20 -- cgexec -g cpu:A/B taskset 0xf0 perf bench
sched pipe -T -l 50000


(1) w/o uclamp

(a) w/o patch: 0.392850 +- 0.000289 seconds time elapsed  ( +-  0.07% )

    w/  patch: 0.330786 +- 0.000401 seconds time elapsed  ( +-  0.12% )

(b) w/o patch: 0.414644 +- 0.000375 seconds time elapsed  ( +-  0.09% )

    w/  patch: 0.353113 +- 0.000393 seconds time elapsed  ( +-  0.11% )

(2) w/ uclamp

(a) w/o patch: 0.393781 +- 0.000488 seconds time elapsed  ( +-  0.12% )

    w/  patch: 0.342726 +- 0.000661 seconds time elapsed  ( +-  0.19% )

(b) w/o patch: 0.416645 +- 0.000520 seconds time elapsed  ( +-  0.12% )

    w/  patch: 0.358098 +- 0.000577 seconds time elapsed  ( +-  0.16% )

Tested-by: Dietmar Eggemann <dietmar.eggemann@....com>

> According to test on hikey, the patch doesn't impact symmetric system
> compared to current implementation (only tested on arm64)
> 
> Also read the uclamped value of task's utilization at most twice instead
> instead each time we compare task's utilization with cpu's capacity.

Could task_util be passed into select_idle_capacity(), avoiding the
second call to uclamp_task_util()?

With this I see a small improvement for (a):

(3) w/ uclamp and passing task_util into select_idle_capacity()

(a) w/  patch: 0.337032 +- 0.000564 seconds time elapsed  ( +-  0.17% )

(b) w/  patch: 0.358467 +- 0.000381 seconds time elapsed  ( +-  0.11% )

[...]

> -symmetric:
> -	if (available_idle_cpu(target) || sched_idle_cpu(target))
> +	if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> +	    asym_fits_capacity(task_util, target))
>  		return target;

Braces because of the multi-line condition?

>  	/*
>  	 * If the previous CPU is cache affine and idle, don't be stupid:
>  	 */
>  	if (prev != target && cpus_share_cache(prev, target) &&
> -	    (available_idle_cpu(prev) || sched_idle_cpu(prev)))
> +	    (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> +	    asym_fits_capacity(task_util, prev))
>  		return prev;

and here ...

[...]
