Message-ID: <20200116163529.GP3466@techsingularity.net>
Date: Thu, 16 Jan 2020 16:35:29 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Phil Auld <pauld@...hat.com>, Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Valentin Schneider <valentin.schneider@....com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Quentin Perret <quentin.perret@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Morten Rasmussen <Morten.Rasmussen@....com>,
Hillf Danton <hdanton@...a.com>,
Parth Shah <parth@...ux.ibm.com>,
Rik van Riel <riel@...riel.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small load imbalance between low
utilisation SD_NUMA domains v4
On Tue, Jan 14, 2020 at 10:13:20AM +0000, Mel Gorman wrote:
> Changelog since V3
> o Allow a fixed imbalance based on a basic comparison with 2 tasks. This turned
>   out to be as good or better than allowing an imbalance based on the group
>   weight, without worrying about potential spillover into the lower scheduler
>   domains.
>
> Changelog since V2
> o Only allow a small imbalance when utilisation is low to address reports that
> higher utilisation workloads were hitting corner cases.
>
> Changelog since V1
> o Alter code flow vincent.guittot
> o Use idle CPUs for comparison instead of sum_nr_running vincent.guittot
> o Note that the division is still in place. Without it and taking
> imbalance_adj into account before the cutoff, two NUMA domains
> do not converge as being equally balanced when the number of
> busy tasks equals the size of one domain (50% of the sum).
>
> The CPU load balancer balances between different domains to spread load
> and strives to have equal balance everywhere. Communicating tasks can
> migrate so they are topologically close to each other but these decisions
> are independent. On a lightly loaded NUMA machine, two communicating tasks
> pulled together at wakeup time can be pushed apart by the load balancer.
> In isolation, the load balancer decision is fine, but it ignores the tasks'
> data locality and the wakeup/LB paths continually conflict. NUMA balancing
> is a factor too, but it likewise conflicts with the load balancer.
>
> This patch allows a fixed degree of imbalance of two tasks to exist
> between NUMA domains regardless of utilisation levels. In many cases,
> this prevents communicating tasks from being pulled apart. It was evaluated
> whether the imbalance should be scaled to the domain size. However, no
> additional benefit was measured across a range of workloads and machines
> and scaling adds the risk that lower domains have to be rebalanced. While
> this could change again in the future, such a change should specify the
> use case and benefit.
>
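For anyone skimming the thread, the core of the change is small. Roughly,
the check looks like the sketch below (a sketch written from the description
above rather than copied from the patch; the hook point is
calculate_imbalance() in kernel/sched/fair.c):

	/* Consider allowing a small imbalance between NUMA groups */
	if (env->sd->flags & SD_NUMA) {
		/*
		 * A fixed allowance of two tasks: enough to keep a simple
		 * pair of communicating tasks on the same node without
		 * risking spillover into the lower scheduler domains.
		 */
		unsigned int imbalance_min = 2;

		if (busiest->sum_nr_running <= imbalance_min)
			env->imbalance = 0;
	}

The fixed value deliberately ignores the domain size, for the reasons given
in the changelog above.
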
Any thoughts on whether this is ok for tip, or are there suggestions for
an alternative approach?
--
Mel Gorman
SUSE Labs