Message-ID: <CAKfTPtBznUt20QFzeQBPELcmN6+F=BOx09oSqVMzSGvXF5ByHg@mail.gmail.com>
Date:   Fri, 17 Jan 2020 14:08:13 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Phil Auld <pauld@...hat.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Quentin Perret <quentin.perret@....com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <Morten.Rasmussen@....com>,
        Hillf Danton <hdanton@...a.com>,
        Parth Shah <parth@...ux.ibm.com>,
        Rik van Riel <riel@...riel.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small load imbalance between low
 utilisation SD_NUMA domains v4

Hi Mel,


On Thu, 16 Jan 2020 at 17:35, Mel Gorman <mgorman@...hsingularity.net> wrote:
>
> On Tue, Jan 14, 2020 at 10:13:20AM +0000, Mel Gorman wrote:
> > Changelog since V3
> > o Allow a fixed imbalance based on a basic comparison with 2 tasks. This
> >   turned out to be as good as or better than allowing an imbalance based on
> >   the group weight, without worrying about potential spillover into the
> >   lower scheduler domains.
> >
> > Changelog since V2
> > o Only allow a small imbalance when utilisation is low to address reports that
> >   higher utilisation workloads were hitting corner cases.
> >
> > Changelog since V1
> > o Alter code flow                                             vincent.guittot
> > o Use idle CPUs for comparison instead of sum_nr_running      vincent.guittot
> > o Note that the division is still in place. Without it and taking
> >   imbalance_adj into account before the cutoff, two NUMA domains
> >   do not converge as being equally balanced when the number of
> >   busy tasks equals the size of one domain (50% of the sum).
> >
> > The CPU load balancer balances between different domains to spread load
> > and strives to have equal balance everywhere. Communicating tasks can
> > migrate so they are topologically close to each other but these decisions
> > are independent. On a lightly loaded NUMA machine, two communicating tasks
> > pulled together at wakeup time can be pushed apart by the load balancer.
> > In isolation, the load balancer decision is fine, but it ignores the tasks'
> > data locality, so the wakeup and load-balance paths continually conflict.
> > NUMA balancing is a factor as well, but it too simply conflicts with the
> > load balancer.
> >
> > This patch allows a fixed imbalance of two tasks to exist between NUMA
> > domains regardless of utilisation levels. In many cases, this prevents
> > communicating tasks from being pulled apart. Scaling the imbalance to the
> > domain size was also evaluated, but no additional benefit was measured
> > across a range of workloads and machines, and scaling adds the risk that
> > lower domains have to be rebalanced. While this could change again in the
> > future, such a change should specify the use case and benefit.
> >
>
> Any thoughts on whether this is ok for tip or are there suggestions on
> an alternative approach?

I have just finished running some tests on my system with your patch
and I haven't seen any noticeable changes so far, which was somewhat
expected. The tests that I usually run use more than 4 tasks on my
2-node system; the only exception is perf sched pipe, and the results
for that test stay the same with and without your patch. I'm curious
whether this impacts Phil's tests, which run the LU.c benchmark with
some CPU-burning tasks.
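
For anyone following the thread without the patch at hand, the behaviour
being discussed amounts to roughly the sketch below. This is a minimal
userspace model of the idea only, not the patch itself: the function name
adjust_numa_imbalance, the NUMA_IMBALANCE_MIN constant and the
busiest_nr_running parameter are all illustrative assumptions here, and
the real logic sits in the load-balancing path in kernel/sched/fair.c.

#include <stdio.h>

/*
 * Model of the idea under discussion: when balancing across NUMA
 * domains, tolerate a small fixed imbalance (two tasks) so that a
 * communicating pair is not pulled apart by the load balancer.
 * All names are illustrative, not taken from the actual patch.
 */
#define NUMA_IMBALANCE_MIN 2	/* fixed allowance: one communicating pair */

static long adjust_numa_imbalance(long imbalance,
				  unsigned int busiest_nr_running)
{
	/*
	 * If the busiest group runs no more tasks than the allowed
	 * pair, ignore the imbalance entirely so the tasks can stay
	 * topologically close to their data and to each other.
	 */
	if (busiest_nr_running <= NUMA_IMBALANCE_MIN)
		return 0;
	return imbalance;
}

int main(void)
{
	/* Two communicating tasks on one node: imbalance is tolerated. */
	printf("2 busy tasks -> imbalance %ld\n", adjust_numa_imbalance(2, 2));
	/* Higher task counts: balance normally. */
	printf("6 busy tasks -> imbalance %ld\n", adjust_numa_imbalance(4, 6));
	return 0;
}

Note that the allowance is a fixed constant rather than a fraction of the
domain size, matching the changelog's point that scaling by group weight
showed no measured benefit and risks disturbing lower domains.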

>
> --
> Mel Gorman
> SUSE Labs
