Message-ID: <YXxNYG6kfyXHP6J6@google.com>
Date:   Fri, 29 Oct 2021 15:37:04 -0400
From:   Joel Fernandes <joel@...lfernandes.org>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, linux-kernel@...r.kernel.org,
        tim.c.chen@...ux.intel.com, dianders@...gle.com,
        qais.yousef@....com, Chris.Redpath@....com,
        Yicong Yang <yangyicong@...ilicon.com>,
        Barry Song <21cnbao@...il.com>
Subject: Re: [PATCH v3 0/5] Improve newidle lb cost tracking and early abort

Hi Vincent,

On Fri, Oct 29, 2021 at 09:49:23AM +0200, Vincent Guittot wrote:
> On Fri, 29 Oct 2021 at 01:25, Joel Fernandes <joel@...lfernandes.org> wrote:
> >
> > Hi,  Vincent, Peter,
> >
> > On Tue, Oct 19, 2021 at 02:35:32PM +0200, Vincent Guittot wrote:
> > > This patchset updates newidle lb cost tracking and early abort:
> > >
> > > The time spent running update_blocked_averages is now accounted in the 1st
> > > sched_domain level. This time can be significant and move the cost of
> > > newidle lb above the avg_idle time.
> > >
> > > > The decay of max_newidle_lb_cost is modified to start only when the field
> > > > has not been updated for a while. A recent update will not be decayed
> > > > immediately, but only after a while.
> > >
> > > The condition of an avg_idle lower than sysctl_sched_migration_cost has
> > > > been removed as the 500us value is quite large and prevents opportunities
> > > > to pull tasks onto the newly idle CPU, at least for the 1st domain levels.
> >
> > It appears this series is not yet in Linus's upstream tree. What's the latest on it?
> >
> 
> I sent an addon yesterday to cover cases that Tim cares about

Oh ok, I'll check it out. Thanks for letting me know.

> > I often see on ARM64 devices that load balance is skipped due to the
> > high sysctl_sched_migration_cost. I saw another thread as well where
> 
> Have you tested the patchset? Does it enable more load balance on
> your platform?

Yes, I tested it and noticed load balance happening more often. I ran some
workloads and noticed it makes things smoother. I am doing more tests as well.

Also, we need this series to fix the update_blocked_averages() latency issues
and I verified that the preemptoff tracer does not show the same high latency
with this series.

thanks,

 - Joel

> > someone complained the performance varies and the default might be too high:
> > https://lkml.org/lkml/2021/9/14/150
> 
> Added Yicong and Barry to the list
> 
> >
> > Any other thoughts? Happy to help on any progress on this series as well. Thanks,
> >
> >  - Joel
> >
> > >
> > > > Monitoring sd->max_newidle_lb_cost on cpu0 of an Arm64 system
> > > THX2 (2 nodes * 28 cores * 4 cpus) during the benchmarks gives the
> > > following results:
> > >        min    avg   max
> > > SMT:   1us   33us  273us - this one includes the update of blocked load
> > > MC:    7us   49us  398us
> > > NUMA: 10us   45us  158us
> > >
> > >
> > > Some results for hackbench -l $LOOPS -g $group :
> > > group      tip/sched/core     + this patchset
> > > 1           15.189(+/- 2%)       14.987(+/- 2%)  +1%
> > > 4            4.336(+/- 3%)        4.322(+/- 5%)  +0%
> > > 16           3.654(+/- 1%)        2.922(+/- 3%) +20%
> > > 32           3.209(+/- 1%)        2.919(+/- 3%)  +9%
> > > 64           2.965(+/- 1%)        2.826(+/- 1%)  +4%
> > > 128          2.954(+/- 1%)        2.993(+/- 8%)  -1%
> > > 256          2.951(+/- 1%)        2.894(+/- 1%)  +2%
> > >
> > > tbench and reaim have not shown any difference
> > >
> > > Change since v2:
> > > - Update and decay of sd->last_decay_max_lb_cost are gathered in
> > >   update_newidle_cost(). The behavior remains almost the same except that
> > >   the decay can happen during newidle_balance now.
> > >
> > > >   Test results haven't shown any differences
> > >
> > > >   I haven't modified rq->max_idle_balance_cost. It acts as the max value
> > > >   for avg_idle and prevents the latter from reaching a high value during a
> > > >   long idle phase. Moving to an IIR filter instead could delay the
> > > >   convergence of avg_idle to a reasonable value that reflects the current
> > > >   situation.
> > >
> > > - Added a minor cleanup of newidle_balance
> > >
> > > Change since v1:
> > > - account the time spent in update_blocked_averages() in the 1st domain
> > >
> > > > - reduce the number of calls to sched_clock_cpu()
> > >
> > > > - change the way max_newidle_lb_cost is decayed. Peter suggested using an
> > > >   IIR filter, but keeping track of the current max value gave the best result
> > >
> > > - removed the condition (this_rq->avg_idle < sysctl_sched_migration_cost)
> > >   as suggested by Peter
> > >
> > > Vincent Guittot (5):
> > >   sched/fair: Account update_blocked_averages in newidle_balance cost
> > >   sched/fair: Skip update_blocked_averages if we are defering load
> > >     balance
> > >   sched/fair: Wait before decaying max_newidle_lb_cost
> > >   sched/fair: Remove sysctl_sched_migration_cost condition
> > >   sched/fair: cleanup newidle_balance
> > >
> > >  include/linux/sched/topology.h |  2 +-
> > >  kernel/sched/fair.c            | 65 ++++++++++++++++++++++------------
> > >  kernel/sched/topology.c        |  2 +-
> > >  3 files changed, 45 insertions(+), 24 deletions(-)
> > >
> > > --
> > > 2.17.1
> > >
