linux-kernel - Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with lowest group_util when idle


Message-ID: <20201106120303.GE3371@techsingularity.net>
Date:   Fri, 6 Nov 2020 12:03:03 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Phil Auld <pauld@...hat.com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Peter Puhov <peter.puhov@...aro.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Robert Foley <robert.foley@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        Jirka Hladky <jhladky@...hat.com>
Subject: Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with
 lowest group_util when idle_cpus are equal

On Wed, Nov 04, 2020 at 09:42:05AM +0000, Mel Gorman wrote:
> While it's possible that some other factor masked the impact of the patch,
> the fact it's neutral for two workloads in 5.10-rc2 is suspicious as it
> indicates that if the patch was implemented against 5.10-rc2, it would
> likely not have been merged. I've queued the tests on the remaining
> machines to see if something more conclusive falls out.
> 

It's not as conclusive as I would like. fork_test generally benefits
across the board but I do not put much weight in that.

Otherwise, it's workload and machine-specific.

schbench: (wakeup latency sensitive), all machines benefitted from the
	revert at low utilisation except one 2-socket Haswell machine
	which showed higher variability when the machine was fully
	utilised.

hackbench: Neutral except for the same 2-socket Haswell machine which
	took a performance penalty of 8% for smaller numbers of groups
	and 4% for higher numbers of groups.

pipetest: Mostly neutral except for the *same* machine showing an 18%
	performance gain by reverting.

kernbench: Shows small gains at low job counts across the board, from
	a lowest gain of 0.84% up to 5.93% depending on the machine.

gitsource: low utilisation execution of the git test suite. This was
	mostly a win for the revert. For the list of machines tested it was

	14.48% gain  (2 socket but SNC enabled to 4 NUMA nodes)
	neutral      (2 socket Broadwell)
	36.37% gain  (1 socket Skylake machine)
	 3.18% gain  (2 socket Broadwell)
	 4.4%        (2 socket EPYC 2)
	 1.85% gain  (2 socket EPYC 1)

While it was clear-cut for 5.9, it's less clear-cut for 5.10-rc2, although
gitsource shows some severe differences depending on the machine that
are worth being extremely cautious about. I would still prefer a revert
but I'm also extremely biased and I know there are other patches in the
pipeline that may change the picture. A wider battery of tests might
paint a clearer picture but may not be worth the time investment.

So maybe let's just keep an eye on this one. When the scheduler pipeline
dies down a bit (does that happen?), we should at least revisit it.

-- 
Mel Gorman
SUSE Labs