Message-ID: <757e0af14b714b596417b31c45098fc314ed7c8a.camel@surriel.com>
Date: Mon, 01 Jul 2019 12:47:35 -0400
From: Rik van Riel <riel@...riel.com>
To: Josef Bacik <josef@...icpanda.com>
Cc: linux-kernel@...r.kernel.org, kernel-team@...com, pjt@...gle.com,
dietmar.eggemann@....com, peterz@...radead.org, mingo@...hat.com,
morten.rasmussen@....com, tglx@...utronix.de,
mgorman@...hsingularity.net, vincent.guittot@...aro.org
Subject: Re: [PATCH 03/10] sched,fair: redefine runnable_load_avg as the sum
of task_h_load
On Mon, 2019-07-01 at 12:29 -0400, Josef Bacik wrote:
> On Fri, Jun 28, 2019 at 04:49:06PM -0400, Rik van Riel wrote:
> > The runnable_load magic is used to quickly propagate information
> > about runnable tasks up the hierarchy of runqueues. The
> > runnable_load_avg is mostly used for the load balancing code, which
> > only examines the value at the root cfs_rq.
> >
> > Redefine the root cfs_rq runnable_load_avg to be the sum of
> > task_h_loads of the runnable tasks. This works because the
> > hierarchical runnable_load of a task is already equal to the
> > task_se_h_load today. This provides enough information to the load
> > balancer.
> >
> > The runnable_load_avg of the cgroup cfs_rqs does not appear to be
> > used for anything, so don't bother calculating those.
> >
> > This removes one of the things that the code currently traverses the
> > cgroup hierarchy for, and getting rid of it brings us one step
> > closer to a flat runqueue for the CPU controller.
> >
>
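In other words, the root runqueue ends up maintaining roughly the
invariant sketched below. This is only an illustration, not part of the
patch: the walk over rq->cfs_tasks is just for the example, and with the
lazily updated h_load values the equality is only approximate.

static void check_root_runnable_load(struct rq *rq)
{
        struct task_struct *p;
        unsigned long sum = 0;

        /* Sum the hierarchical load of every runnable task on this CPU. */
        list_for_each_entry(p, &rq->cfs_tasks, se.group_node)
                sum += task_se_h_load(&p->se);

        /* The root cfs_rq runnable_load_avg should track that sum. */
        SCHED_WARN_ON(rq->cfs.avg.runnable_load_avg != sum);
}
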
> My memory on this stuff is very hazy, but IIRC we had the runnable_sum
> and the runnable_avg separated out because you could have the avg lag
> behind the sum. So like you enqueue a bunch of new entities whose avg
> may have decayed a bunch, and so their overall load is not felt on the
> CPU until they start running, and now you've overloaded that CPU. The
> sum was there to make sure new things coming onto the CPU added actual
> load to the queue instead of looking like there was no load.
>
> Is this going to be a problem now with this new code?
That is a good question!
On the one hand, you may well be right.
On the other hand, while I see the old code calculating
runnable_sum, I don't really see it _using_ it to drive
scheduling decisions.
It would be easy to define the CPU cfs_rq->runnable_load_sum
as being the sum of task_se_h_weight() of each runnable task
on the CPU (for example), but what would we use it for?
What am I missing?
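Just to make that concrete, I am thinking of something like the below,
mirroring the helper in this patch. This is only a sketch: the
enqueued_h_weight field is made up for the example, nothing in the
series currently has or needs it.

static inline void
update_runnable_load_sum(struct sched_entity *se)
{
        struct cfs_rq *root_cfs_rq = &cfs_rq_of(se)->rq->cfs;
        long new_h_weight, delta;

        SCHED_WARN_ON(!entity_is_task(se));

        if (!se->on_rq)
                return;

        /* Track the hierarchical weight instead of the hierarchical load. */
        new_h_weight = task_se_h_weight(se);
        delta = new_h_weight - se->enqueued_h_weight;
        add_positive(&root_cfs_rq->avg.runnable_load_sum, delta);
        se->enqueued_h_weight = new_h_weight;
}
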
> > +static inline void
> > +update_runnable_load_avg(struct sched_entity *se)
> > +{
> > +	struct cfs_rq *root_cfs_rq = &cfs_rq_of(se)->rq->cfs;
> > +	long new_h_load, delta;
> > +
> > +	SCHED_WARN_ON(!entity_is_task(se));
> > +
> > +	if (!se->on_rq)
> > +		return;
> >
> > -	sub_positive(&cfs_rq->avg.runnable_load_avg, se->avg.runnable_load_avg);
> > -	sub_positive(&cfs_rq->avg.runnable_load_sum,
> > -		     se_runnable(se) * se->avg.runnable_load_sum);
> > +	new_h_load = task_se_h_load(se);
> > +	delta = new_h_load - se->enqueued_h_load;
> > +	root_cfs_rq->avg.runnable_load_avg += delta;
>
> Should we use add_positive here? Thanks,
Yes, we should use add_positive. I'll do that for v3.
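For reference, the difference that makes is the clamp-at-zero behaviour
when a negative delta exceeds the current value. A quick user-space
model of it (not the kernel macro itself):

#include <stdio.h>

/*
 * Simplified model of add_positive(): clamp at zero rather than letting
 * a negative delta wrap an unsigned accumulator around.
 */
static void add_positive(unsigned long *ptr, long val)
{
        unsigned long res = *ptr + val;

        if (val < 0 && res > *ptr)      /* addition wrapped below zero */
                res = 0;

        *ptr = res;
}

int main(void)
{
        unsigned long runnable_load_avg = 100;

        /* A negative delta larger than the current average. */
        add_positive(&runnable_load_avg, -250);
        printf("%lu\n", runnable_load_avg); /* 0, not a huge wrapped value */
        return 0;
}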
--
All Rights Reversed.