linux-kernel - Re: [PATCH 3/4] sched/fair: minimize concurrent LBs between domain level


Message-ID: <CAKfTPtA5PHN=9ykqHd5MYJvTxdR_pdtZOO=gjsJ7AWfLnLzMag@mail.gmail.com>
Date:   Wed, 16 Sep 2020 08:54:40 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/4] sched/fair: minimize concurrent LBs between domain level

On Tue, 15 Sep 2020 at 21:04, Valentin Schneider
<valentin.schneider@....com> wrote:
>
>
> On 14/09/20 11:03, Vincent Guittot wrote:
> > Sched domains tend to trigger their load balance loops simultaneously, but
> > the larger domains often need more time to collect statistics. As a result,
> > a larger domain may try to detach tasks from a rq when those tasks have
> > already been migrated elsewhere at a sub-domain level. This is not
> > a real problem for idle LB because the period of the smaller domains
> > increases once their CPUs become busy, which leaves time for the higher
> > ones to pull tasks. But it becomes a problem when all CPUs are already busy,
> > because all domains stay in sync when they trigger their LB.
> >
> > A simple way to minimize simultaneous LB of all domains is to decrement the
> > busy interval by 1 jiffy. Because of the busy_factor, the interval of a
> > larger domain will then no longer be a multiple of the smaller ones.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> > ---
> >  kernel/sched/fair.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 765be8273292..7d7eefd8e2d4 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9780,6 +9780,9 @@ get_sd_balance_interval(struct sched_domain *sd, int cpu_busy)
> >
> >       /* scale ms to jiffies */
> >       interval = msecs_to_jiffies(interval);
>
> A comment here would be nice, I think. What about:
>
> /*
>  * Reduce likelihood of (busy) balancing at higher domains racing with
>  * balancing at lower domains by preventing their balancing periods from being
>  * multiples of each other.
>  */

Yes, a comment would be nice. I will add it.

Thanks
>
> > +     if (cpu_busy)
> > +             interval -= 1;
> > +
> >       interval = clamp(interval, 1UL, max_load_balance_interval);
> >
> >       return interval;
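
To illustrate the reasoning in the commit message, here is a minimal standalone sketch (userspace C, not kernel code). The helper names busy_interval() and is_multiple() are hypothetical, and the sketch assumes 1 jiffy == 1 ms so that the ms-to-jiffies conversion is the identity. It only mirrors the shape of the busy path: scale by a busy factor, optionally subtract one jiffy, then apply the lower clamp.

```c
#include <assert.h>

/*
 * Hypothetical helper, not the kernel's get_sd_balance_interval().
 * Assumption: 1 jiffy == 1 ms, so no unit conversion is needed.
 * Returns the busy balance interval in jiffies for a domain whose
 * base interval is base_ms, scaled by busy_factor.
 */
static unsigned long busy_interval(unsigned long base_ms,
                                   unsigned long busy_factor,
                                   int decrement)
{
    unsigned long interval;

    assert(base_ms > 0 && busy_factor > 0);

    interval = base_ms * busy_factor;

    /* The change under discussion: shift the busy interval by one jiffy. */
    if (decrement)
        interval -= 1;

    /* Lower clamp only; the upper bound is omitted in this sketch. */
    return interval < 1 ? 1 : interval;
}

/* Nonzero when the larger interval is an exact multiple of the smaller. */
static int is_multiple(unsigned long larger, unsigned long smaller)
{
    return larger % smaller == 0;
}
```

With illustrative base intervals of 2, 4 and 8 ms and an assumed busy factor of 16, the undecremented intervals are 32, 64 and 128 jiffies, so every balance at a larger domain lines up with one at each smaller domain. After the decrement they become 31, 63 and 127 jiffies, none of which is a multiple of another, so the domains coincide far less often. The values are assumptions chosen for the example, not taken from the patch.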