[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.22.394.2010211732230.8475@hadrien>
Date: Wed, 21 Oct 2020 17:33:42 +0200 (CEST)
From: Julia Lawall <julia.lawall@...ia.fr>
To: Vincent Guittot <vincent.guittot@...aro.org>
cc: Mel Gorman <mgorman@...e.de>, Ingo Molnar <mingo@...hat.com>,
kernel-janitors@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
Valentin Schneider <valentin.schneider@....com>,
Gilles Muller <Gilles.Muller@...ia.fr>
Subject: Re: [PATCH] sched/fair: check for idle core
On Wed, 21 Oct 2020, Vincent Guittot wrote:
> On Wed, 21 Oct 2020 at 17:18, Julia Lawall <julia.lawall@...ia.fr> wrote:
> >
> >
> >
> > On Wed, 21 Oct 2020, Mel Gorman wrote:
> >
> > > On Wed, Oct 21, 2020 at 03:24:48PM +0200, Julia Lawall wrote:
> > > > > I worry it's overkill because prev is always used if it is idle even
> > > > > if it is on a node remote to the waker. It cuts off the option of a
> > > > > wakee moving to a CPU local to the waker which is not equivalent to the
> > > > > original behaviour.
> > > >
> > > > But it is equal to the original behavior in the idle prev case if you go
> > > > back to the runnable load average days...
> > > >
> > >
> > > It is similar but it misses the sync treatment and sd->imbalance_pct part of
> > > wake_affine_weight which has unpredictable consequences. The data
> > > available is only on the fully utilised case.
> >
> > OK, what if my patch were:
> >
> > @@ -5800,6 +5800,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> > if (sync && cpu_rq(this_cpu)->nr_running == 1)
> > return this_cpu;
> >
> > + if (!sync && available_idle_cpu(prev_cpu))
> > + return prev_cpu;
> > +
>
> this is not useful because when prev_cpu is idle, its runnable_avg was
> null so the only
> way for this_cpu to be selected by wake_affine_weight is to be null
> too which is not really
> possible when sync is set because sync is used to say, the running
> task on this cpu
> is about to sleep
OK, I agree. Previously prev_eff_load was 0 when prev was idle, and
whether the sync code is executed in wake_affine_weight or not, it will
not b the case that this_eff_load < prev_eff_load, so this will not be
selected.
julia
>
> > return nr_cpumask_bits;
> > }
> >
> > The sd->imbalance_pct part would have previously been a multiplication by
> > 0, so it doesn't need to be taken into account.
> >
> > julia
> >
> > >
> > > > The problem seems impossible to solve, because there is no way to know by
> > > > looking only at prev and this whether the thread would prefer to stay
> > > > where it was or go to the waker.
> > > >
> > >
> > > Yes, this is definitely true. Looking at prev_cpu and this_cpu is a
> > > crude approximation and the path is heavily limited in terms of how
> > > clever it can be.
> > >
> > > --
> > > Mel Gorman
> > > SUSE Labs
> > >
>
Powered by blists - more mailing lists