linux-kernel - Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with lowest group_util when idle

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20201104094205.GI3306@suse.de>
Date:   Wed, 4 Nov 2020 09:42:05 +0000
From:   Mel Gorman <mgorman@...e.de>
To:     Phil Auld <pauld@...hat.com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Peter Puhov <peter.puhov@...aro.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Robert Foley <robert.foley@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        Jirka Hladky <jhladky@...hat.com>
Subject: Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with
 lowest group_util when idle_cpus are equal

On Mon, Nov 02, 2020 at 09:44:18AM -0500, Phil Auld wrote:
> > > I have not investigated why because I do not have the bandwidth
> > > to do a detailed study (I was off for a few days and my backlog is
> > > severe). However, I recommend in before v5.10 this be reverted and retried.
> > > If I'm cc'd on v2, I'll run the same tests through the grid and see what
> > > falls out.
> > 
> > I'm going to have a look at the regressions and see if  patches that
> > have been queued for v5.10 or even more recent patch can help or if
> > the patch should be adjusted
> >
> 
> Fwiw, we have pulled this in, along with some of the 5.10-rc1 fixes in this
> area and in the load balancing code.
> 
> We found some load balancing improvements and some minor overall perf
> gains in a few places, but generally did not see any difference from before
> the commit mentioned here.
> 
> I'm wondering, Mel, if you have compared 5.10-rc1? 
> 

The results indicate that reverting on 5.9 would have been the right
decision. It's less clear for 5.10-rc2 so I'm only showing the 5.10-rc2
comparison. Bear in mind that this is one machine only so I'll be
rerunning against all the affected machines according to the bisections
run against 5.9.

aim9
                                5.10.0-rc2             5.10.0-rc2
                                   vanilla        5.10-rc2-revert
Hmean     page_test   510863.13 (   0.00%)   517673.91 *   1.33%*
Hmean     brk_test   1807400.76 (   0.00%)  1814815.29 *   0.41%*
Hmean     exec_test      821.41 (   0.00%)      841.05 *   2.39%*
Hmean     fork_test     4399.71 (   0.00%)     5124.86 *  16.48%*

Reverting shows a 16.48% gain for fork_test and minor gains for others.
To be fair, I don't generally consider the fork_test to be particularly
important because fork microbenchmarks that do no real work are rarely
representative of anything useful. It tends to go up and down a lot and
it's rare a regression in fork_test correlates to anything else.

Hackbench failed to run because I typo'd the configuration. Kernel build
benchmark and git test suite both were inconclusive for 5.10-rc2
(neutral results) although the showed 10-20% gain for kernbench and 24%
gain in git test suite by reverting in 5.9.

The gitsource test was interesting for a few reasons. First, the big
difference between 5.9 and 5.10 is that the workload is mostly concentrated
on one NUMA node. mpstat shows that 5.10-rc2 uses all of the CPUs on one
node lightly. Reverting the patch shows that far fewer CPUs are used at
a higher utilisation -- not particularly high utilisation because of the
nature of the workload but noticable. i.e.  gitsource with the revert
packs the workload onto fewer CPUs. The same holds for fork_test --
reverting packs the workload onto fewer CPUs with higher utilisation on
each of them. Generally this plays well with cpufreq without schedutil
using fewer CPUs means the CPU is likely to reach higher frequencies.

While it's possible that some other factor masked the impact of the patch,
the fact it's neutral for two workloads in 5.10-rc2 is suspicious as it
indicates that if the patch was implemented against 5.10-rc2, it would
likely not have been merged. I've queued the tests on the remaining
machines to see if something more conclusive falls out.

-- 
Mel Gorman
SUSE Labs