lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <jhjimce6ev2.mognet@arm.com>
Date:   Tue, 15 Sep 2020 20:04:33 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] sched/fair: reduce minimal imbalance threshold


On 14/09/20 11:03, Vincent Guittot wrote:
> The 25% default imbalance threshold for DIE and NUMA domain is large
> enough to generate significant unfairness between threads. A typical
> example is the case of 11 threads running on 2x4 CPUs. The imbalance of
> 20% between the 2 groups of 4 cores is just low enough to not trigger
> the load balance between the 2 groups. We will have always the same 6
> threads on one group of 4 CPUs and the other 5 threads on the other
> group of CPUS. With a fair time sharing in each group, we ends up with
> +20% running time for the group of 5 threads.
>

AIUI this is the culprit:

                if (100 * busiest->avg_load <=
                                env->sd->imbalance_pct * local->avg_load)
                        goto out_balanced;

As in your case imbalance_pct=120 becomes the tipping point.

Now, ultimately this would need to scale based on the underlying topology,
right? If you have a system with 2x32 cores running {33 threads, 34
threads}, the tipping point becomes imbalance_pct≈103; but then since you
have this many more cores, it is somewhat questionable.

> Consider decreasing the imbalance threshold for overloaded case where we
> use the load to balance task and to ensure fair time sharing.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>  kernel/sched/topology.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 9079d865a935..1a84b778755d 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1337,7 +1337,7 @@ sd_init(struct sched_domain_topology_level *tl,
>               .min_interval		= sd_weight,
>               .max_interval		= 2*sd_weight,
>               .busy_factor		= 32,
> -		.imbalance_pct		= 125,
> +		.imbalance_pct		= 117,
>
>               .cache_nice_tries	= 0,

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ