Date: Wed, 23 Dec 2020 14:23:41 +0100
From: Vincent Guittot <vincent.guittot@...aro.org>
To: "Li, Aubrey" <aubrey.li@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
    Mel Gorman <mgorman@...hsingularity.net>,
    linux-kernel <linux-kernel@...r.kernel.org>,
    Ingo Molnar <mingo@...hat.com>,
    Juri Lelli <juri.lelli@...hat.com>,
    Valentin Schneider <valentin.schneider@....com>,
    Qais Yousef <qais.yousef@....com>,
    Dietmar Eggemann <dietmar.eggemann@....com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Ben Segall <bsegall@...gle.com>,
    Tim Chen <tim.c.chen@...ux.intel.com>,
    Jiang Biao <benbjiang@...il.com>
Subject: Re: [RFC][PATCH 0/5] select_idle_sibling() wreckage

On Wed, 16 Dec 2020 at 19:07, Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Wed, 16 Dec 2020 at 14:00, Li, Aubrey <aubrey.li@...ux.intel.com> wrote:
> >
> > Hi Peter,
> >
> > On 2020/12/15 0:48, Peter Zijlstra wrote:
> > > Hai, here them patches Mel asked for. They've not (yet) been through the
> > > robots, so there might be some build fail for configs I've not used.
> > >
> > > Benchmark time :-)
> > >
> >
> > Here is the data on my side, benchmarks were tested on a x86 4 sockets system
> > with 24 cores per socket and 2 hyperthreads per core, total 192 CPUs.
> >
> > uperf throughput: netperf workload, tcp_nodelay, r/w size = 90
> >
> >   threads       baseline-avg    %std    patch-avg       %std
> >   96            1               0.78    1.0072          1.09
> >   144           1               0.58    1.0204          0.83
> >   192           1               0.66    1.0151          0.52
> >   240           1               2.08    0.8990          0.75
> >
> > hackbench: process mode, 25600 loops, 40 file descriptors per group
> >
> >   group         baseline-avg    %std    patch-avg       %std
> >   2(80)         1               10.02   1.0339          9.94
> >   3(120)        1               6.69    1.0049          6.92
> >   4(160)        1               6.76    0.8663          8.74
> >   5(200)        1               2.96    0.9651          4.28
> >
> > schbench: 99th percentile latency, 16 workers per message thread
> >
> >   mthread       baseline-avg    %std    patch-avg       %std
> >   6(96)         1               0.88    1.0055          0.81
> >   9(144)        1               0.59    1.0007          0.37
> >   12(192)       1               0.61    0.9973          0.82
> >   15(240)       1               25.05   0.9251          18.36
> >
> > sysbench mysql throughput: read/write, table size = 10,000,000
> >
> >   thread        baseline-avg    %std    patch-avg       %std
> >   96            1               6.62    0.9668          4.04
> >   144           1               9.29    0.9579          6.53
> >   192           1               9.52    0.9503          5.35
> >   240           1               8.55    0.9657          3.34
> >
> > It looks like
> > - hackbench has a significant improvement of 4 groups
> > - uperf has a significant regression of 240 threads
>
> Tests are still running on my side but early results shows perf
> regression for hackbench

A few more results before I am off:

On small embedded systems, the problem seems to be mainly a matter of
setting the right number of loops.

On large SMT systems, the machine on which I usually run my tests is off
for now, so I haven't been able to finalize the tests yet, but the
problem might be that we no longer loop over all cores with this
patchset, compared to the current algorithm.

>
> >
> > Please let me know if you have any interested cases I can run/rerun.
> >
> > Thanks,
> > -Aubrey
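
For readers comparing the quoted tables, here is a minimal sketch of one way to read the normalized numbers. It is not part of the original thread. The `Row` type, the `lower_is_better` flag, and the rule "a change counts only if it exceeds the larger of the two %std values" are all assumptions made for illustration; the message does not say how "significant" was decided. Only a few rows from the tables are included.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One table row; baseline-avg is always 1 in the quoted data."""
    benchmark: str
    config: str
    base_std: float        # baseline %std
    patch_avg: float       # patched result, normalized to the baseline
    patch_std: float       # patched %std
    lower_is_better: bool  # assumption: True for time/latency metrics


def gain_percent(row: Row) -> float:
    """Signed change in percent, positive meaning the patch did better."""
    change = (row.patch_avg - 1.0) * 100.0
    return -change if row.lower_is_better else change


def classify(row: Row) -> str:
    """Heuristic (assumed): ignore changes no larger than the noisier %std."""
    gain = gain_percent(row)
    noise = max(row.base_std, row.patch_std)
    if abs(gain) <= noise:
        return "within noise"
    return "improvement" if gain > 0 else "regression"


# A subset of the rows quoted above.
ROWS = [
    Row("uperf", "240", 2.08, 0.8990, 0.75, lower_is_better=False),
    Row("hackbench", "4(160)", 6.76, 0.8663, 8.74, lower_is_better=True),
    Row("hackbench", "2(80)", 10.02, 1.0339, 9.94, lower_is_better=True),
    Row("schbench", "15(240)", 25.05, 0.9251, 18.36, lower_is_better=True),
    Row("sysbench", "96", 6.62, 0.9668, 4.04, lower_is_better=False),
]

for row in ROWS:
    print(f"{row.benchmark:10s} {row.config:8s} "
          f"{gain_percent(row):+7.2f}%  {classify(row)}")
```

For the rows included here, this heuristic flags the same two cases the message calls out (hackbench at 4 groups, uperf at 240 threads) and treats the rest as noise. A stricter or looser threshold would change that picture.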