Date:   Wed, 16 Sep 2020 08:53:32 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/4] sched/fair: reduce minimal imbalance threshold

On Tue, 15 Sep 2020 at 21:04, Valentin Schneider
<valentin.schneider@....com> wrote:
>
>
> On 14/09/20 11:03, Vincent Guittot wrote:
> > The default 25% imbalance threshold for the DIE and NUMA domains is
> > large enough to generate significant unfairness between threads. A
> > typical example is the case of 11 threads running on 2x4 CPUs. The
> > imbalance of 20% between the 2 groups of 4 cores is just low enough
> > not to trigger the load balance between the 2 groups. We will always
> > have the same 6 threads on one group of 4 CPUs and the other 5
> > threads on the other group of CPUs. With fair time sharing within
> > each group, we end up with +20% running time for the group of 5
> > threads.
> >
>
> AIUI this is the culprit:
>
>                 if (100 * busiest->avg_load <=
>                                 env->sd->imbalance_pct * local->avg_load)
>                         goto out_balanced;
>
> As in your case, imbalance_pct=120 (i.e. 100 * 6/5) becomes the tipping
> point.
>
> Now, ultimately this would need to scale based on the underlying topology,
> right? If you have a system with 2x32 cores running {33 threads, 34
> threads}, the tipping point becomes imbalance_pct≈103; but with that many
> more cores, whether such a small imbalance matters is somewhat
> questionable.
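
To make that cut-off concrete, here is a standalone sketch (plain C, not
kernel code; the thread counts and imbalance_pct values are just the
examples from this thread) of the out_balanced condition, approximating
each group's avg_load by its number of CPU-bound threads:

#include <stdio.h>

/*
 * Toy model of the find_busiest_group() cut-off quoted above: treat
 * each group's avg_load as proportional to its number of CPU-bound
 * threads.  Returns 1 when the scheduler would bail out via
 * out_balanced, i.e. no load balance between the two groups.
 */
static int out_balanced(int busiest_threads, int local_threads,
			int imbalance_pct)
{
	return 100 * busiest_threads <= imbalance_pct * local_threads;
}

int main(void)
{
	/* 11 threads on 2x4 CPUs: 6 vs 5; tipping point is 100*6/5 = 120 */
	printf("6 vs 5 @ pct=125: %s\n",
	       out_balanced(6, 5, 125) ? "out_balanced" : "balance");
	printf("6 vs 5 @ pct=117: %s\n",
	       out_balanced(6, 5, 117) ? "out_balanced" : "balance");

	/* 2x32 cores, {33, 34} threads: tipping point is 100*34/33 ~= 103 */
	printf("34 vs 33 @ pct=117: %s\n",
	       out_balanced(34, 33, 117) ? "out_balanced" : "balance");
	return 0;
}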

I wanted to stay conservative and not trigger too much task
migration because of a small imbalance, so I decided to decrease the
default threshold to the same level as the MC groups (117); but this
can still generate unfairness. With your example of 2x32 cores, if you
end up with 33 tasks in one group and 38 in the other, the system is
overloaded, so load and imbalance_pct are used; but the load ratio
(~115) stays below the new threshold, and the 38 tasks will each get
about 13% less running time than the 33.
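
As a sanity check on those numbers, a quick standalone computation
(plain C; the 2x32 topology and the 33-vs-38 split are just your
hypothetical example):

#include <stdio.h>

int main(void)
{
	double cpus  = 32.0;               /* CPUs per group */
	double small = 33.0, big = 38.0;   /* CPU-bound tasks per group */

	/* per-task CPU share with fair time sharing inside each group */
	double share_small = cpus / small; /* ~0.97 */
	double share_big   = cpus / big;   /* ~0.84 */

	/* load ratio compared against imbalance_pct: ~115, below 117 */
	printf("load ratio: %.0f\n", 100.0 * big / small);

	/* the 38 tasks each run ~13% less than the 33 ... */
	printf("deficit: -%.0f%%\n",
	       100.0 * (1.0 - share_big / share_small));

	/* ... equivalently, the 33 tasks each run ~15% more */
	printf("surplus: +%.0f%%\n",
	       100.0 * (share_small / share_big - 1.0));
	return 0;
}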

This new imbalance_pct seems like a reasonable step toward decreasing
the unfairness.

>
> > Decrease the imbalance threshold for the overloaded case, where load
> > is used to balance tasks, in order to ensure fair time sharing.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> > ---
> >  kernel/sched/topology.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index 9079d865a935..1a84b778755d 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -1337,7 +1337,7 @@ sd_init(struct sched_domain_topology_level *tl,
> >               .min_interval           = sd_weight,
> >               .max_interval           = 2*sd_weight,
> >               .busy_factor            = 32,
> > -             .imbalance_pct          = 125,
> > +             .imbalance_pct          = 117,
> >
> >               .cache_nice_tries       = 0,
