Message-ID: <20191219100232.GY2844@hirez.programming.kicks-ass.net>
Date: Thu, 19 Dec 2019 11:02:32 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Valentin Schneider <valentin.schneider@....com>
Cc: Mel Gorman <mgorman@...hsingularity.net>,
Vincent Guittot <vincent.guittot@...aro.org>,
Ingo Molnar <mingo@...nel.org>, pauld@...hat.com,
srikar@...ux.vnet.ibm.com, quentin.perret@....com,
dietmar.eggemann@....com, Morten.Rasmussen@....com,
hdanton@...a.com, parth@...ux.ibm.com, riel@...riel.com,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small degree of load imbalance between SD_NUMA domains

On Wed, Dec 18, 2019 at 06:50:52PM +0000, Valentin Schneider wrote:
> I'm quite sure you have reasons to have written it that way, but I was
> hoping we could squash it down to something like:
> ---
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 08a233e97a01..f05d09a8452e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8680,16 +8680,27 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> env->migration_type = migrate_task;
> lsub_positive(&nr_diff, local->sum_nr_running);
> env->imbalance = nr_diff >> 1;
> - return;
> + } else {
> +
> + /*
> + * If there is no overload, we just want to even the number of
> + * idle cpus.
> + */
> + env->migration_type = migrate_task;
> + env->imbalance = max_t(long, 0, (local->idle_cpus -
> + busiest->idle_cpus) >> 1);
> }
>
> /*
> - * If there is no overload, we just want to even the number of
> - * idle cpus.
> + * Allow for a small imbalance between NUMA groups; don't do any
> + * of it if there is at least half as many tasks / busy CPUs as
> + * there are available CPUs in the busiest group
> */
> - env->migration_type = migrate_task;
> - env->imbalance = max_t(long, 0, (local->idle_cpus -
> - busiest->idle_cpus) >> 1);
> + if (env->sd->flags & SD_NUMA &&
> + (busiest->sum_nr_running < busiest->group_weight >> 1) &&
> + (env->imbalance < busiest->group_weight * (env->sd->imbalance_pct - 100) / 100))
Note that this form lets us avoid the division. Every time I see that
/100 I think we should rename imbalance_pct and make it a base-2
thing.
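
To make the point about the division concrete, here is a rough userspace sketch, not kernel code. It assumes one reading of the remark: because the threshold is only used in a comparison, the /100 can be moved to the other side as a multiply, and a base-2 scale would turn it into a shift. The helper names, IMB_SHIFT, and the `frac` parameter are hypothetical and exist only for illustration; only the arithmetic of the first helper is taken from the quoted hunk.

```c
#include <stdbool.h>

/*
 * Form used in the quoted hunk: scale group_weight by
 * (imbalance_pct - 100) percent, which needs an integer division.
 */
static bool small_imbalance_div(long imbalance, long weight, long pct)
{
	return imbalance < weight * (pct - 100) / 100;
}

/*
 * Cross-multiplied comparison: no division. Because nothing is
 * truncated, it can differ from the form above right at the boundary
 * (e.g. weight = 6, pct = 125, imbalance = 1).
 */
static bool small_imbalance_mul(long imbalance, long weight, long pct)
{
	return imbalance * 100 < weight * (pct - 100);
}

/*
 * Hypothetical base-2 variant: the allowed fraction is expressed in
 * units of 1/1024 (so 256 means 25%), and the division becomes a shift.
 */
#define IMB_SHIFT 10

static bool small_imbalance_shift(long imbalance, long weight, long frac)
{
	return imbalance < ((weight * frac) >> IMB_SHIFT);
}
```

With a 48-CPU group and imbalance_pct = 125 (or frac = 256 in the base-2 variant), all three helpers allow an imbalance of up to 11 and reject 12.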
> + env->imbalance = 0;
> +
> return;
> }
>