Date: Wed, 9 Feb 2022 10:40:15 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Peter Zijlstra <peterz@...radead.org>,
    Ingo Molnar <mingo@...nel.org>,
    Vincent Guittot <vincent.guittot@...aro.org>,
    Valentin Schneider <valentin.schneider@....com>,
    Aubrey Li <aubrey.li@...ux.intel.com>,
    Barry Song <song.bao.hua@...ilicon.com>,
    Mike Galbraith <efault@....de>,
    Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
    Gautham Shenoy <gautham.shenoy@....com>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] sched/fair: Adjust the allowed NUMA imbalance when SD_NUMA spans multiple LLCs

Hello Mel,

On 2/8/2022 3:13 PM, Mel Gorman wrote:
[..snip..]
> On a Zen3 machine running STREAM parallelised with OMP to have on instance
> per LLC the results and without binding, the results are
>
>                             5.17.0-rc0             5.17.0-rc0
>                                vanilla       sched-numaimb-v6
> MB/sec copy-16    162596.94 (   0.00%)   580559.74 ( 257.05%)
> MB/sec scale-16   136901.28 (   0.00%)   374450.52 ( 173.52%)
> MB/sec add-16     157300.70 (   0.00%)   564113.76 ( 258.62%)
> MB/sec triad-16   151446.88 (   0.00%)   564304.24 ( 272.61%)

I was able to test STREAM without binding on different NPS
configurations of a two-socket Zen3 machine.
The results look good:

sched-tip - 5.17.0-rc1 tip sched/core
mel-v6    - 5.17.0-rc1 tip sched/core + this patch

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Stream with 16 threads.
built with -DSTREAM_ARRAY_SIZE=128000000, -DNTIMES=10
Zen3, 64C128T per socket, 2 sockets,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NPS1

Test:           sched-tip                  mel-v6
 Copy:   114470.18 (0.00 pct)    152806.94 (33.49 pct)
Scale:   111575.12 (0.00 pct)    189784.57 (70.09 pct)
  Add:   125436.15 (0.00 pct)    213371.05 (70.10 pct)
Triad:   123068.86 (0.00 pct)    209809.11 (70.48 pct)

NPS2

Test:           sched-tip                  mel-v6
 Copy:    57936.28 (0.00 pct)    155038.70 (167.60 pct)
Scale:    55599.30 (0.00 pct)    192601.59 (246.41 pct)
  Add:    63096.96 (0.00 pct)    211462.58 (235.13 pct)
Triad:    61983.39 (0.00 pct)    208909.34 (237.04 pct)

NPS4

Test:           sched-tip                  mel-v6
 Copy:    43946.42 (0.00 pct)    119583.69 (172.11 pct)
Scale:    33750.96 (0.00 pct)    180130.83 (433.70 pct)
  Add:    39109.72 (0.00 pct)    170296.68 (335.43 pct)
Triad:    36598.88 (0.00 pct)    169953.47 (364.36 pct)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Stream with 16 threads.
built with -DSTREAM_ARRAY_SIZE=128000000, -DNTIMES=100
Zen3, 64C128T per socket, 2 sockets,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

NPS1

Test:           sched-tip                  mel-v6
 Copy:   132402.79 (0.00 pct)    225587.85 (70.37 pct)
Scale:   126923.02 (0.00 pct)    214363.58 (68.89 pct)
  Add:   145596.55 (0.00 pct)    260901.92 (79.19 pct)
Triad:   143092.91 (0.00 pct)    249081.79 (74.06 pct)

NPS2

Test:           sched-tip                  mel-v6
 Copy:   107386.27 (0.00 pct)    227623.31 (111.96 pct)
Scale:   100941.44 (0.00 pct)    218116.63 (116.08 pct)
  Add:   115854.52 (0.00 pct)    272756.95 (135.43 pct)
Triad:   113369.96 (0.00 pct)    260235.32 (129.54 pct)

NPS4

Test:           sched-tip                  mel-v6
 Copy:    91083.07 (0.00 pct)    247163.90 (171.36 pct)
Scale:    90352.54 (0.00 pct)    223914.31 (147.82 pct)
  Add:   101973.98 (0.00 pct)    272842.42 (167.56 pct)
Triad:    99773.65 (0.00 pct)    258904.54 (159.49 pct)

There is a significant improvement across the board,
with v6 outperforming tip/sched/core in every case!

Tested-by: K Prateek Nayak <kprateek.nayak@....com>

--
Thanks and Regards
Prateek
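
For anyone re-deriving the pct columns: they appear to be the relative
change of mel-v6 over the sched-tip baseline. Below is a minimal sketch of
that arithmetic, assuming this is how the figures were produced (the mail
does not say). The helper name pct_change is hypothetical, and only the
NPS1, NTIMES=10 rows are copied in. Recomputed values may differ from the
printed ones in the last digit.

```python
def pct_change(baseline: float, patched: float) -> float:
    """Relative change of `patched` over `baseline`, in percent.

    Assumed formula; the mail does not state how its pct column was computed.
    """
    return (patched - baseline) / baseline * 100.0


# NPS1, NTIMES=10 rows copied from the table above: (sched-tip, mel-v6).
nps1_ntimes10 = {
    "Copy": (114470.18, 152806.94),
    "Scale": (111575.12, 189784.57),
    "Add": (125436.15, 213371.05),
    "Triad": (123068.86, 209809.11),
}

for test, (tip, v6) in nps1_ntimes10.items():
    # Same column layout as the tables in the mail.
    print(f"{test:>5}: {tip:12.2f} (0.00 pct) {v6:12.2f} ({pct_change(tip, v6):.2f} pct)")
```

The same helper can be applied to the NPS2 and NPS4 rows by adding them to
the dictionary.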