linux-kernel - Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with lowest group_util when idle


Message-ID: <20201110140507.GI3371@techsingularity.net>
Date:   Tue, 10 Nov 2020 14:05:07 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Phil Auld <pauld@...hat.com>, Peter Puhov <peter.puhov@...aro.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Robert Foley <robert.foley@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        Jirka Hladky <jhladky@...hat.com>
Subject: Re: [PATCH v1] sched/fair: update_pick_idlest() Select group with
 lowest group_util when idle_cpus are equal

On Mon, Nov 09, 2020 at 04:49:07PM +0100, Vincent Guittot wrote:
> > This aspect is somewhat critical because the patches affect CPU
> > selection. If a mostly idle CPU is used due to spreading load wider,
> > it can take longer to ramp up to the highest frequency. It can be a
> > dominating factor and may account for some of the differences.
> 
> I agree but that also highlights that the problem comes from frequency
> selection more than task placement. In such a case, instead of trying
> to bias task placement to compensate for wrong freq selection, we
> should look at the freq selection itself. Not sure if it's the case
> but it's worth identifying if perf regression comes from task
> placement and data locality or from freq selection
> 

That's a fair point, although it's worth noting that biasing the freq
selection itself means that schedutil needs to become the default, which
is not quite there yet. Otherwise, the machine is often relying on
firmware hints as to how quickly it should ramp up, or on per-driver
hacks, which is the road to hell.
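
To make the ramp-up concern concrete, here is a rough sketch of the
shape of schedutil's utilisation-to-frequency mapping as I understand
it (next_freq ~= 1.25 * max_freq * util / max_capacity, before any
clamping or rate limiting). The function name and units are made up
for illustration; this is not the kernel code itself.

```c
#include <stdint.h>

/*
 * Illustrative only: approximates the schedutil-style mapping
 *   next_freq = 1.25 * max_freq * util / max_cap
 * The result is NOT clamped to the hardware range here; a real
 * governor would clamp it and apply rate limiting.
 * sketch_next_freq() is a hypothetical name, not a kernel symbol.
 */
static inline uint64_t sketch_next_freq(uint64_t max_freq_khz,
                                        uint64_t util,
                                        uint64_t max_cap)
{
    /* max_freq + max_freq/4 gives the 1.25x headroom factor */
    return (max_freq_khz + (max_freq_khz >> 2)) * util / max_cap;
}
```

With a 2 GHz maximum and a capacity scale of 1024, a CPU whose
utilisation is only 128 would request far less than one that is
already busy, which is why placing a task on a mostly idle CPU can
delay reaching the highest frequency.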

> >
> > Generally my tests are not based on the performance governor because a)
> > it's not a universal win and b) the powersave/ondemand governors should
> > be able to function reasonably well. For short-lived workloads it may
> > not matter but ultimately schedutil should be good enough that it can
> 
> Yeah, schedutil should be able to manage this. But there is another
> place which impacts benchmarks that are based on a lot of fork/exec:
> the initial value of task's PELT signal. Current implementation tries
> to accommodate both perf and embedded systems but might fail to satisfy
> either of them in the end.
> 

Quite likely. Assuming schedutil becomes the default, it may be
necessary to have either a tunable or a kconfig option that controls
whether the initial PELT signal starts low and ramps up, picks a
midpoint, or starts high and scales down.
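
As a minimal sketch of what such a knob might select between, assuming
a capacity scale of 1024 and entirely hypothetical names (nothing
below exists in the kernel):

```c
#include <assert.h>

/* Assumed capacity scale for this sketch. */
#define SKETCH_CAPACITY_SCALE 1024UL

/* Hypothetical policy values a tunable or kconfig option could expose. */
enum sketch_init_util_policy {
    SKETCH_INIT_UTIL_LOW,   /* start at zero and ramp up */
    SKETCH_INIT_UTIL_MID,   /* start at half the CPU capacity */
    SKETCH_INIT_UTIL_HIGH,  /* start at full capacity and decay down */
};

/*
 * Return the initial utilisation for a newly forked task under the
 * given policy. cpu_capacity is expected to be within the scale.
 */
static unsigned long
sketch_initial_util(enum sketch_init_util_policy policy,
                    unsigned long cpu_capacity)
{
    assert(cpu_capacity <= SKETCH_CAPACITY_SCALE);

    switch (policy) {
    case SKETCH_INIT_UTIL_LOW:
        return 0;
    case SKETCH_INIT_UTIL_MID:
        return cpu_capacity / 2;
    case SKETCH_INIT_UTIL_HIGH:
        return cpu_capacity;
    }
    /* Unknown policy value: fall back to the midpoint. */
    return cpu_capacity / 2;
}
```

A fork/exec-heavy benchmark would presumably favour the high setting,
while an embedded system concerned about power would favour the low
one.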

-- 
Mel Gorman
SUSE Labs