linux-kernel - Re: [PATCH 4/4] sched/fair: reduce busy load balance interval


Message-ID: <CAKfTPtB+YM4B1XL5KPNg1pCP1q5z4+=qqDz2_r3v3jZgfXbmsA@mail.gmail.com>
Date:   Tue, 15 Sep 2020 14:42:49 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Jiang Biao <benbjiang@...il.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Valentin Schneider <valentin.schneider@....com>
Subject: Re: [PATCH 4/4] sched/fair: reduce busy load balance interval

On Tue, 15 Sep 2020 at 13:36, Jiang Biao <benbjiang@...il.com> wrote:
>
> Hi, Vincent
>
> On Tue, 15 Sep 2020 at 17:28, Vincent Guittot
> <vincent.guittot@...aro.org> wrote:
> >
> > On Tue, 15 Sep 2020 at 11:11, Jiang Biao <benbjiang@...il.com> wrote:
> > >
> > > Hi, Vincent
> > >
> > > On Mon, 14 Sep 2020 at 18:07, Vincent Guittot
> > > <vincent.guittot@...aro.org> wrote:
> > > >
> > > > > The busy_factor, which increases the load balance interval when a CPU is
> > > > > busy, is set to 32 by default. This value generates huge LB intervals on
> > > > > large systems like the THX2, made of 2 nodes x 28 cores x 4 threads.
> > > > > For such a system, the interval increases from 112ms to 3584ms at MC level,
> > > > > and from 224ms to 7168ms at NUMA level.
> > > Agreed that the interval is too big for that case.
> > > > But would it be too small for an AMD environment (like ROME) with 8 CPUs
> > > > at MC level (CCX) if we reduce busy_factor?
> >
> > Are you sure that this is too small? As mentioned in the commit
> > message below, I tested it on a small system (2x4 cores Arm64) and I
> > have seen some improvements.
> Not so sure. :)
> A small interval means more frequent balancing and more time spent on
> balancing, especially for pinned VM cases.

If you are running only pinned threads, the interval can increase
above 512ms, which means about 8 seconds after applying the busy factor.
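
To make the arithmetic in this thread concrete, here is a minimal sketch. It assumes the busy interval is simply the base interval multiplied by busy_factor, which matches the 112ms to 3584ms figure quoted above. The helper name busy_interval_ms is hypothetical, and the factor of 16 used for the pinned case is inferred from the "256ms to 128ms" halving mentioned in the thread; it is not kernel code.

```python
# Illustrative arithmetic only; this is not kernel code.
# busy_interval_ms is a hypothetical helper name.

def busy_interval_ms(base_interval_ms: int, busy_factor: int) -> int:
    """Interval between load balances on a busy CPU, in milliseconds."""
    return base_interval_ms * busy_factor

# THX2 figure quoted in the thread, with the default busy_factor of 32.
thx2_mc = busy_interval_ms(112, 32)

# Pinned-thread case: 512ms base interval, assumed busy_factor of 16.
pinned = busy_interval_ms(512, 16)

print(thx2_mc, "ms at MC level on THX2")
print(pinned, "ms for pinned threads, roughly 8 seconds")
```

With the default factor of 32, the same 512ms base would give twice the pinned figure.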

> For our case, we have AMD ROME servers made of 2 nodes x 48 cores x
> 2 threads, and 8c at MC level (within a CCX). The 256ms interval seems a
> little too big for us, compared to an Intel Cascade Lake CPU with 48c at MC

So IIUC your topology is:
2 nodes at NUMA
6 CCX at DIE level
8 cores per CCX at MC
2 threads per core at SMT
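
As a quick consistency check on the topology summarised above, the sketch below multiplies out the levels and compares the result with the "2 nodes x 48 cores x 2 threads" description from the quoted message. The variable names are hypothetical and the level counts are taken from this thread as written, not from any hardware documentation.

```python
from math import prod

# Level counts as listed in the thread (an "IIUC" reading, not verified).
topology = [
    ("NUMA nodes", 2),
    ("CCX per node (DIE level)", 6),
    ("cores per CCX (MC level)", 8),
    ("threads per core (SMT level)", 2),
]

cores_per_node = 6 * 8                       # CCX per node x cores per CCX
total_cpus = prod(count for _, count in topology)

print("cores per node:", cores_per_node)
print("total logical CPUs:", total_cpus)
```

Under this reading the totals agree with 2 x 48 x 2 logical CPUs for the whole machine.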

> level, whose balance interval is 1536ms. 128ms seems a little more
> wasteful. :)

The 256ms/128ms interval only looks at 8 cores, whereas the 1536ms
interval looks at the whole 48 cores.
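
One way to read this comparison is to normalise each interval by the number of CPUs the domain covers. The sketch below does that with the figures quoted in this thread. The helper ms_per_cpu and the dictionary labels are hypothetical, and treating "8" and "48" as the number of CPUs scanned is an assumption.

```python
# Illustrative only: (CPUs covered, busy interval in ms) as quoted in the thread.
domains = {
    "ROME MC, current": (8, 256),
    "ROME MC, proposed": (8, 128),
    "Cascade Lake MC": (48, 1536),
}

def ms_per_cpu(cpus: int, interval_ms: int) -> float:
    """Busy balance interval divided by the number of CPUs the domain spans."""
    return interval_ms / cpus

for name, (cpus, interval_ms) in domains.items():
    print(f"{name}: {ms_per_cpu(cpus, interval_ms):.0f} ms per CPU covered")
```

Seen this way, the 256ms and 1536ms intervals correspond to the same interval per CPU covered, so the larger absolute value reflects the wider domain rather than a more relaxed setting.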

> I guess more balancing cost may hurt the throughput of sysbench-like
> benchmarks. Just a guess.
>
> >
> > > For that case, the interval could be reduced from 256ms to 128ms.
> > > Or should we define a MIN_INTERVAL for MC level to avoid too small an interval?
> >
> > What would be too small an interval?
> That's hard to say. :)
> My guess only applies to large server systems.
>
> Thanks.
> Regards,
> Jiang