Date:   Wed, 19 Dec 2018 09:27:28 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Morten Rasmussen <Morten.Rasmussen@....com>
Subject: Re: [PATCH v2 3/3] sched/fair: fix unnecessary increase of balance interval

On Tue, 18 Dec 2018 at 15:09, Valentin Schneider
<valentin.schneider@....com> wrote:
>
> On 18/12/2018 13:23, Vincent Guittot wrote:
> [...]
> >> Ah, I think I get it: you're saying that this balance_interval increase
> >> is done because it is always assumed we do an active balance with
> >> busiest->nr_running > 1 && pinned tasks, and that all that is left to
> >> migrate after the active_balance is pinned tasks that we can't actually
> >> migrate.
> >>
> >> Maybe that's what we should target (as always, totally untested):
> >>
> >> -----8<-----
> >> @@ -9131,18 +9144,16 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> >>         } else
> >>                 sd->nr_balance_failed = 0;
> >>
> >> -       if (likely(!active_balance)) {
> >> -               /* We were unbalanced, so reset the balancing interval */
> >> -               sd->balance_interval = sd->min_interval;
> >> -       } else {
> >> +       if (unlikely(active_balance && (env.flags & LBF_ALL_PINNED))) {
> >
> > So it's not exactly LBF_ALL_PINNED but also some pinned tasks
> >
>
> Wouldn't LBF_ALL_PINNED cover all relevant cases? It's set at the very top
> of the 'if (busiest->nr_running > 1)' block and cleared whenever we find
> at least one task we could pull, even if we don't pull it because of
> other reasons in can_migrate_task() (e.g. cache hotness).
>
> If we have LBF_SOME_PINNED but not LBF_ALL_PINNED, we currently don't
> increase the balance_interval, which is what we would want to maintain.

But there are several other use cases that trigger active migration and
increase the interval, like when all tasks except the running one are pinned

>
> > but IIUC, you would like to extend the reset of balance interval to
> > all the "good" reasons to trigger active load balance
>
> Right
>
> > In fact, those are the conditions used in need_active_balance(),
> > except the last remaining one. Good reasons are:
> > - asym packing
> > - /*
> > * The dst_cpu is idle and the src_cpu CPU has only 1 CFS task.
> > * It's worth migrating the task if the src_cpu's capacity is reduced
> > * because of other sched_class or IRQs if more capacity stays
> > * available on dst_cpu.
> > */
> > - misfit task
> >
> > I can extend the test for these conditions
>
> So that's all the need_active_balance() cases except the last
> sd->nr_balance_failed one. I'd argue this could also be counted as a
> "good" reason to active balance which shouldn't lead to a balance_interval
> increase. Plus, it keeps to the logic of increasing the balance_interval
> only when task affinity gets in the way.

But this is some kind of affinity too: cache hotness is a way for the
scheduler to temporarily pin a task on a CPU to take advantage of its
hot cache.

I would prefer to be conservative and only reset the interval when we
are sure that active load balance is really what we want to do.
Asym packing is one such case; we can add the misfit case and moving a
task to a CPU with more available capacity as well. For the other case,
it's less obvious and I would prefer to keep the current behavior.
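The direction being discussed could be sketched roughly as follows (hypothetical code, not the actual patch; `enum ab_reason`, `struct sd_state`, and `update_interval()` are invented names): reset balance_interval for the agreed "good" active-balance reasons, and keep backing off only when pinned tasks forced the migration.

```c
/* hypothetical classification of why an active balance was triggered */
enum ab_reason {
	AB_ASYM_PACKING,	/* asym packing */
	AB_REDUCED_CAPACITY,	/* dst idle, src capacity eaten by other classes/IRQ */
	AB_MISFIT,		/* misfit task */
	AB_PINNED_FALLBACK,	/* could only move the running task: affinity in the way */
};

/* simplified stand-in for the sched_domain interval fields */
struct sd_state {
	unsigned int balance_interval;
	unsigned int min_interval;
	unsigned int max_interval;
};

static void update_interval(struct sd_state *sd, enum ab_reason reason)
{
	if (reason == AB_PINNED_FALLBACK) {
		/* affinity got in the way: back off and retry later */
		if (sd->balance_interval < sd->max_interval)
			sd->balance_interval *= 2;
	} else {
		/* a "good" reason: load was genuinely unbalanced, reset */
		sd->balance_interval = sd->min_interval;
	}
}
```

Under this sketch, only the pinned fallback keeps doubling the interval; asym packing, reduced capacity, and misfit migrations reset it to min_interval, while the conservative position above would keep other cases (e.g. nr_balance_failed-driven ones) on the doubling path.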
