Date: Fri, 7 Feb 2020 10:42:44 +0000
From: Quentin Perret <qperret@...gle.com>
To: Valentin Schneider <valentin.schneider@....com>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com, peterz@...radead.org,
    vincent.guittot@...aro.org, dietmar.eggemann@....com,
    morten.rasmussen@....com, adharmap@...eaurora.org,
    pkondeti@...eaurora.org
Subject: Re: [PATCH v4 0/4] sched/fair: Capacity aware wakeup rework

On Thursday 06 Feb 2020 at 19:19:53 (+0000), Valentin Schneider wrote:
> Pixel3 (DynamIQ)
> ++++++++++++++++
>
> Ideally I would have used a DB845C but had a few issues with mine, so I
> went with a mainline-ish Pixel3 instead [1]. It's still the same SoC under
> the hood (Snapdragon 845), which has 4 bigs and 4 LITTLEs:
>
>   +-------------------------------+
>   |               L3              |
>   +---+---+---+---+---+---+---+---+
>   | L2| L2| L2| L2| L2| L2| L2| L2|
>   +---+---+---+---+---+---+---+---+
>   | L | L | L | L | B | B | B | B |
>   +---+---+---+---+---+---+---+---+
>
> Default topology (single MC domain)
> -----------------------------------
>
> 100 iterations of 'hackbench -l 200'
>
> |      |   -PATCH |   +PATCH | DELTA (%) |
> |------+----------+----------+-----------|
> | mean | 1.131360 | 1.102560 |    -2.546 |
> | std  | 0.116322 | 0.101999 |   -12.313 |
> | min  | 0.935000 | 0.935000 |    +0.000 |
> | 50%  | 1.099000 | 1.097500 |    -0.136 |
> | 75%  | 1.211250 | 1.157750 |    -4.417 |
> | 99%  | 1.401020 | 1.338210 |    -4.483 |
> | max  | 1.502000 | 1.359000 |    -9.521 |
>
> 100 iterations of 'sysbench --max-time=5 --max-requests=-1 --test=threads --num-threads=8 run':
>
> |      |      -PATCH |      +PATCH | DELTA (%) |
> |------+-------------+-------------+-----------|
> | mean | 7108.310000 | 8731.610000 |   +22.837 |
> | std  |  199.431854 |  206.826912 |    +3.708 |
> | min  | 6655.000000 | 8251.000000 |   +23.982 |
> | 50%  | 7107.500000 | 8705.000000 |   +22.476 |
> | 75%  | 7255.500000 | 8868.250000 |   +22.228 |
> | 99%  | 7539.540000 | 9155.520000 |   +21.433 |
> | max  | 7593.000000 | 9207.000000 |   +21.256 |
>
> Phantom domains (MC + DIE)
> --------------------------
>
> This is mostly included for the sake of completeness.
>
> 100 iterations of 'sysbench --max-time=5 --max-requests=-1 --test=threads --num-threads=8 run':
>
> |      |      -PATCH |      +PATCH | DELTA (%) |
> |------+-------------+-------------+-----------|
> | mean | 7317.940000 | 9328.470000 |   +27.474 |
> | std  |  460.372682 |  181.528886 |   -60.569 |
> | min  | 5888.000000 | 8832.000000 |   +50.000 |
> | 50%  | 7271.000000 | 9348.000000 |   +28.566 |
> | 75%  | 7497.500000 | 9477.250000 |   +26.405 |
> | 99%  | 8464.390000 | 9634.160000 |   +13.820 |
> | max  | 8602.000000 | 9650.000000 |   +12.183 |

So, it feels like the most interesting test would be 'baseline w/ phantom
domains' vs 'this patch w/o phantom domains' right ? The 'baseline w/o
phantom domains' case is arguably borked today, so it isn't that
interesting (even though it performs well for the particular workload you
choose here, as expected, but I guess you might see issues in others).

So, IIUC, based on your results above, that would be:

|      |     base+PD |  patch+noPD | DELTA (%) |
|------+-------------+-------------+-----------|
| mean | 7317.940000 | 8731.610000 |   +19.318 |
| std  |  460.372682 |  206.826912 |   -55.074 |
| min  | 5888.000000 | 8251.000000 |   +40.132 |
| 50%  | 7271.000000 | 8705.000000 |   +19.722 |
| 75%  | 7497.500000 | 8868.250000 |   +18.283 |
| 99%  | 8464.390000 | 9155.520000 |    +8.165 |
| max  | 8602.000000 | 9207.000000 |    +7.033 |

Is that correct ?

If so, this patch series is still a very big win, and I'm all for getting
it merged. But I find it interesting that the results aren't as good as
having this patch _and_ phantom domains at the same time ...

Any idea why having phantom domains helps ? select_idle_capacity()
should behave the same w/ or w/o phantom domains given that you use
sd_asym_cpucapacity directly. I'm guessing something else has an impact
here ? LB / misfit behaving a bit differently perhaps ?

Thanks,
Quentin
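
For reference, the DELTA (%) column of the 'base+PD' vs 'patch+noPD' comparison can be recomputed from the two quoted sysbench tables. This is only a minimal sketch, assuming the delta is defined as (new - old) / old * 100, which matches the quoted numbers; the names `base_pd`, `patch_nopd` and `delta_pct` are illustrative and do not come from the thread.

```python
# Values copied from the quoted sysbench tables in this thread.
# base_pd:    -PATCH column, phantom domains (MC + DIE)
# patch_nopd: +PATCH column, default topology (single MC domain)
base_pd = {
    "mean": 7317.940000, "std": 460.372682, "min": 5888.000000,
    "50%": 7271.000000, "75%": 7497.500000, "99%": 8464.390000,
    "max": 8602.000000,
}
patch_nopd = {
    "mean": 8731.610000, "std": 206.826912, "min": 8251.000000,
    "50%": 8705.000000, "75%": 8868.250000, "99%": 9155.520000,
    "max": 9207.000000,
}


def delta_pct(old, new):
    """Relative change from old to new, in percent (assumed definition)."""
    return (new - old) / old * 100.0


deltas = {stat: delta_pct(base_pd[stat], patch_nopd[stat]) for stat in base_pd}

# Print rows in the same layout as the tables in the thread.
for stat, delta in deltas.items():
    print(f"| {stat:<4} | {base_pd[stat]:11.6f} | {patch_nopd[stat]:11.6f} | {delta:+9.3f} |")
```

The same helper applied to the two columns of any of the quoted tables should give that table's DELTA (%) column, which makes it easy to check other pairings (for instance patch+PD against base+PD).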