Date:   Fri, 7 Feb 2020 10:42:44 +0000
From:   Quentin Perret <qperret@...gle.com>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, morten.rasmussen@....com,
        adharmap@...eaurora.org, pkondeti@...eaurora.org
Subject: Re: [PATCH v4 0/4] sched/fair: Capacity aware wakeup rework

On Thursday 06 Feb 2020 at 19:19:53 (+0000), Valentin Schneider wrote:
> Pixel3 (DynamIQ)
> ++++++++++++++++
> 
> Ideally I would have used a DB845C but had a few issues with mine, so I
> went with a mainline-ish Pixel3 instead [1]. It's still the same SoC under
> the hood (Snapdragon 845), which has 4 bigs and 4 LITTLEs:
> 
>   +-------------------------------+
>   |               L3              |
>   +---+---+---+---+---+---+---+---+
>   | L2| L2| L2| L2| L2| L2| L2| L2|
>   +---+---+---+---+---+---+---+---+
>   | L | L | L | L | B | B | B | B |
>   +---+---+---+---+---+---+---+---+
> 
> Default topology (single MC domain)
> -----------------------------------
> 
> 100 iterations of 'hackbench -l 200'
> 
> |      |   -PATCH |   +PATCH | DELTA (%) |
> |------+----------+----------+-----------|
> | mean | 1.131360 | 1.102560 |    -2.546 |
> | std  | 0.116322 | 0.101999 |   -12.313 |
> | min  | 0.935000 | 0.935000 |    +0.000 |
> | 50%  | 1.099000 | 1.097500 |    -0.136 |
> | 75%  | 1.211250 | 1.157750 |    -4.417 |
> | 99%  | 1.401020 | 1.338210 |    -4.483 |
> | max  | 1.502000 | 1.359000 |    -9.521 |
> 
> 100 iterations of 'sysbench --max-time=5 --max-requests=-1 --test=threads --num-threads=8 run':
> 
> |      |      -PATCH |      +PATCH | DELTA (%) |
> |------+-------------+-------------+-----------|
> | mean | 7108.310000 | 8731.610000 |   +22.837 |
> | std  |  199.431854 |  206.826912 |    +3.708 |
> | min  | 6655.000000 | 8251.000000 |   +23.982 |
> | 50%  | 7107.500000 | 8705.000000 |   +22.476 |
> | 75%  | 7255.500000 | 8868.250000 |   +22.228 |
> | 99%  | 7539.540000 | 9155.520000 |   +21.433 |
> | max  | 7593.000000 | 9207.000000 |   +21.256 |
> 
> Phantom domains (MC + DIE)
> --------------------------
> 
> This is mostly included for the sake of completeness.
> 
> 100 iterations of 'sysbench --max-time=5 --max-requests=-1 --test=threads --num-threads=8 run':
> 
> |      |      -PATCH |      +PATCH | DELTA (%) |
> |------+-------------+-------------+-----------|
> | mean | 7317.940000 | 9328.470000 |   +27.474 |
> | std  |  460.372682 |  181.528886 |   -60.569 |
> | min  | 5888.000000 | 8832.000000 |   +50.000 |
> | 50%  | 7271.000000 | 9348.000000 |   +28.566 |
> | 75%  | 7497.500000 | 9477.250000 |   +26.405 |
> | 99%  | 8464.390000 | 9634.160000 |   +13.820 |
> | max  | 8602.000000 | 9650.000000 |   +12.183 |


So, it feels like the most interesting test would be

 'baseline w/ phantom domains' vs 'this patch w/o phantom domains'

right? The 'baseline w/o phantom domains' case is arguably borked today,
so it isn't that interesting (even though it performs well for the
particular workload you chose here, as expected, but I guess you might
see issues in others).

So, IIUC, based on your results above, that would be:

|      |     base+PD |  patch+noPD | DELTA (%) |
|------+-------------+-------------+-----------|
| mean | 7317.940000 | 8731.610000 |   +19.318 |
| std  |  460.372682 |  206.826912 |   -55.074 |
| min  | 5888.000000 | 8251.000000 |   +40.132 |
| 50%  | 7271.000000 | 8705.000000 |   +19.722 |
| 75%  | 7497.500000 | 8868.250000 |   +18.283 |
| 99%  | 8464.390000 | 9155.520000 |    +8.165 |
| max  | 8602.000000 | 9207.000000 |    +7.033 |

Is that correct?
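
For anyone who wants to double-check the DELTA column, here is a minimal
sketch of how it can be recomputed. It assumes DELTA is the relative change
(after - before) / before, in percent; the dictionary and function names are
made up for illustration, and the numbers are copied from the tables above.

```python
# Recompute the 'base+PD' vs 'patch+noPD' DELTA column.
# Assumption: DELTA (%) = (after - before) / before * 100.

# sysbench results, -PATCH with phantom domains (MC + DIE)
base_pd = {
    "mean": 7317.94, "std": 460.372682, "min": 5888.0, "50%": 7271.0,
    "75%": 7497.5, "99%": 8464.39, "max": 8602.0,
}

# sysbench results, +PATCH with the default topology (single MC domain)
patch_nopd = {
    "mean": 8731.61, "std": 206.826912, "min": 8251.0, "50%": 8705.0,
    "75%": 8868.25, "99%": 9155.52, "max": 9207.0,
}


def delta_pct(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


deltas = {key: delta_pct(base_pd[key], patch_nopd[key]) for key in base_pd}

for key, delta in deltas.items():
    print(f"| {key:<4} | {base_pd[key]:11.6f} | {patch_nopd[key]:11.6f} "
          f"| {delta:+9.3f} |")
```

Under that assumption the recomputed values agree with the table to three
decimal places.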

If so, this patch series is still a very big win, and I'm all for
getting it merged. But I find it interesting that the results aren't as
good as having this patch _and_ phantom domains at the same time ...

Any idea why having phantom domains helps? select_idle_capacity()
should behave the same w/ or w/o phantom domains given that you use
sd_asym_cpucapacity directly. I'm guessing something else has an impact
here? LB / misfit behaving a bit differently perhaps?
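
To illustrate the reasoning, here is a toy model of that kind of scan. It is
a sketch under assumptions, not the code from the series: the Cpu class, the
function name, the capacity values (400 for LITTLEs, 1024 for bigs) and the
task utilization are all hypothetical, and it leaves out everything the real
function may do beyond the basic scan. The only inputs are the CPUs in the
asymmetric span, the task utilization and the scan start, so nothing about
the domain hierarchy below that span can change its result.

```python
from dataclasses import dataclass


@dataclass
class Cpu:
    capacity: int  # hypothetical capacity value for this CPU
    idle: bool


def fits_capacity(util, capacity):
    # Leave roughly 20% headroom: util * 1280 < capacity * 1024.
    return util * 1280 < capacity * 1024


def select_idle_capacity_model(cpus, task_util, target):
    """Toy scan over the whole asymmetric span, starting at target.

    Returns the first idle CPU the task fits on; otherwise the idle CPU
    with the largest capacity; otherwise -1 if no CPU is idle.
    """
    best_cap, best_cpu = 0, -1
    for offset in range(len(cpus)):
        cpu = (target + offset) % len(cpus)  # wrap around the span
        if not cpus[cpu].idle:
            continue
        if fits_capacity(task_util, cpus[cpu].capacity):
            return cpu
        if cpus[cpu].capacity > best_cap:
            best_cap, best_cpu = cpus[cpu].capacity, cpu
    return best_cpu


# 4 LITTLEs followed by 4 bigs, all idle (hypothetical capacities).
span = [Cpu(400, True) for _ in range(4)] + [Cpu(1024, True) for _ in range(4)]
picked = select_idle_capacity_model(span, task_util=500, target=0)
print(picked)
```

With these made-up numbers the task does not fit on a LITTLE, so the scan
skips CPUs 0-3 and picks the first idle big.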

Thanks,
Quentin
