lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200215215823.GY3420@suse.de>
Date:   Sat, 15 Feb 2020 21:58:24 +0000
From:   Mel Gorman <mgorman@...e.de>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        linux-kernel@...r.kernel.org, pauld@...hat.com,
        parth@...ux.ibm.com, valentin.schneider@....com, hdanton@...a.com
Subject: Re: [PATCH v2 0/5] remove runnable_load_avg and improve
 group_classify

On Fri, Feb 14, 2020 at 04:27:24PM +0100, Vincent Guittot wrote:
> This new version stays quite close to the previous one and should 
> replace without problems the previous one that part of Mel's patchset:
> https://lkml.org/lkml/2020/2/14/156
> 

As far as I can see, the differences are harmless and look sane. I do think
that an additional fix is mandatory as I see no reason why the regression
was fixed. As such, I'll release a v3 of the series that includes your
new patches with the minimal fix inserted where appropriate. I'll have
tests running over the rest of the weekend.

> Some hackbench results:
> 
> - small arm64 dual quad cores system
> hackbench -l (2560/#grp) -g #grp
> 
> grp    tip/sched/core         +patchset              improvement
> 1       1,327(+/-10,06 %)     1,247(+/-5,45 %)       5,97 %
> 4       1,250(+/- 2,55 %)     1,207(+/-2,12 %)       3,42 %
> 8       1,189(+/- 1,47 %)     1,179(+/-1,93 %)       0,90 %
> 16      1,221(+/- 3,25 %)     1,219(+/-2,44 %)       0,16 %						
> 
> - large arm64 2 nodes / 224 cores system
> hackbench -l (256000/#grp) -g #grp
> 
> grp    tip/sched/core         +patchset              improvement
> 1      14,197(+/- 2,73 %)     13,917(+/- 2,19 %)     1,98 %
> 4       6,817(+/- 1,27 %)      6,523(+/-11,96 %)     4,31 %
> 16      2,930(+/- 1,07 %)      2,911(+/- 1,08 %)     0,66 %
> 32      2,735(+/- 1,71 %)      2,725(+/- 1,53 %)     0,37 %
> 64      2,702(+/- 0,32 %)      2,717(+/- 1,07 %)    -0,53 %
> 128     3,533(+/-14,66 %)     3,123(+/-12,47 %)     11,59 %
> 256     3,918(+/-19,93 %)     3,390(+/- 5,93 %)     13,47 %
> 
> The significant improvement for 128 and 256 should be taken with care
> because of some instabilities over iterations without the patchset.
> 

For the most part I do not see similar results to this with hackbench with
one exception -- EPYC first generation. I don't have results for EPYC 2
yet but I'm curious if the machine you have has multiple L3 caches per
NUMA domain? Various Intel CPU generations show improvements but they're
not as dramatic.  Tests will tell me for sure but I have some confidence
that it'll look like

Small tracing patches -- no difference
Vincent Patches 1-2   -- regressions
Fix from Mel          -- small overall improvement on baseline
Vincent patches 3-5   -- small improvements mostly, sometimes big ones
                         on hackbench depending on the machine
Rest of Mel series    -- generally ok across machines and CPU generations

Even if the improvements are not dramatic, I think it'll be worth it to
have NUMA and CPU balancer using similarly sane logic and overall I find
the load balancer easier to understand with the new logic so yey!

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ