linux-kernel - Re: [PATCH 03/10] sched,fair: redefine runnable_load_avg as the sum of task_h


Message-ID: <757e0af14b714b596417b31c45098fc314ed7c8a.camel@surriel.com>
Date:   Mon, 01 Jul 2019 12:47:35 -0400
From:   Rik van Riel <riel@...riel.com>
To:     Josef Bacik <josef@...icpanda.com>
Cc:     linux-kernel@...r.kernel.org, kernel-team@...com, pjt@...gle.com,
        dietmar.eggemann@....com, peterz@...radead.org, mingo@...hat.com,
        morten.rasmussen@....com, tglx@...utronix.de,
        mgorman@...hsingularity.net, vincent.guittot@...aro.org
Subject: Re: [PATCH 03/10] sched,fair: redefine runnable_load_avg as the sum
 of task_h_load

On Mon, 2019-07-01 at 12:29 -0400, Josef Bacik wrote:
> On Fri, Jun 28, 2019 at 04:49:06PM -0400, Rik van Riel wrote:
> > The runnable_load magic is used to quickly propagate information about
> > runnable tasks up the hierarchy of runqueues. The runnable_load_avg is
> > mostly used for the load balancing code, which only examines the value at
> > the root cfs_rq.
> > 
> > Redefine the root cfs_rq runnable_load_avg to be the sum of task_h_loads
> > of the runnable tasks. This works because the hierarchical runnable_load of
> > a task is already equal to the task_se_h_load today. This provides enough
> > information to the load balancer.
> > 
> > The runnable_load_avg of the cgroup cfs_rqs does not appear to be
> > used for anything, so don't bother calculating those.
> > 
> > This removes one of the things that the code currently traverses the
> > cgroup hierarchy for, and getting rid of it brings us one step closer
> > to a flat runqueue for the CPU controller.
> > 
> 
> My memory on this stuff is very hazy, but IIRC we had the runnable_sum and the
> runnable_avg separated out because you could have the avg lag behind the sum.
> So like you enqueue a bunch of new entities whose avg may have decayed a bunch
> and so their overall load is not felt on the CPU until they start running, and
> now you've overloaded that CPU.  The sum was there to make sure new things
> coming onto the CPU added actual load to the queue instead of looking like there
> was no load.
> 
> Is this going to be a problem now with this new code?

That is a good question!

On the one hand, you may well be right.

On the other hand, while I see the old code calculating
runnable_sum, I don't really see it _using_ it to drive
scheduling decisions.

It would be easy to define the CPU cfs_rq->runnable_load_sum
as being the sum of task_se_h_weight() of each runnable task
on the CPU (for example), but what would we use it for?

What am I missing?
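
To make the two candidate quantities concrete, here is a minimal userspace
sketch, not kernel code. The struct toy_se and both helper functions are
hypothetical stand-ins invented for illustration; h_weight plays the role of
task_se_h_weight() and h_load the role of task_se_h_load(). It only shows that
a sum of weights stays large for freshly enqueued tasks whose averages have
decayed, while a sum of h_loads does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task's sched_entity; not the kernel layout. */
struct toy_se {
	unsigned long h_weight;	/* assumed analogue of task_se_h_weight() */
	unsigned long h_load;	/* assumed analogue of task_se_h_load() */
};

/* Sum of per-task hierarchical load: what the patch puts in the root cfs_rq. */
static unsigned long toy_sum_h_load(const struct toy_se *tasks, size_t n)
{
	unsigned long sum = 0;
	size_t i;

	assert(tasks || n == 0);
	for (i = 0; i < n; i++)
		sum += tasks[i].h_load;
	return sum;
}

/* Sum of per-task hierarchical weight: the "runnable_load_sum" idea above. */
static unsigned long toy_sum_h_weight(const struct toy_se *tasks, size_t n)
{
	unsigned long sum = 0;
	size_t i;

	assert(tasks || n == 0);
	for (i = 0; i < n; i++)
		sum += tasks[i].h_weight;
	return sum;
}
```

With one busy task and two tasks whose averages have mostly decayed, the
weight sum counts all three equally, while the load sum is dominated by the
busy one. Whether anything would consume the weight sum is the open question.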

> > +static inline void
> > +update_runnable_load_avg(struct sched_entity *se)
> > +{
> > +	struct cfs_rq *root_cfs_rq = &cfs_rq_of(se)->rq->cfs;
> > +	long new_h_load, delta;
> > +
> > +	SCHED_WARN_ON(!entity_is_task(se));
> > +
> > +	if (!se->on_rq)
> > +		return;
> >  
> > -	sub_positive(&cfs_rq->avg.runnable_load_avg, se->avg.runnable_load_avg);
> > -	sub_positive(&cfs_rq->avg.runnable_load_sum,
> > -		     se_runnable(se) * se->avg.runnable_load_sum);
> > +	new_h_load = task_se_h_load(se);
> > +	delta = new_h_load - se->enqueued_h_load;
> > +	root_cfs_rq->avg.runnable_load_avg += delta;
> 
> Should we use add_positive here?  Thanks,

Yes, we should use add_positive. I'll do that for v3.
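
For reference, a rough userspace sketch of what the clamped update could look
like. This is not the v3 patch and not the kernel's add_positive macro:
toy_task, add_clamped() and toy_update_runnable_load() are hypothetical names,
and storing the new value back into enqueued_h_load is an assumption, since
that part of the patch is not quoted above. The point is only that a negative
delta larger than the current value clamps to zero instead of wrapping the
unsigned sum.

```c
#include <assert.h>

/* Hypothetical stand-in for the task's sched_entity; not the kernel layout. */
struct toy_task {
	int on_rq;
	unsigned long enqueued_h_load;
};

/* Add a signed delta to an unsigned value, clamping at zero on underflow. */
static unsigned long add_clamped(unsigned long cur, long delta)
{
	if (delta < 0) {
		/* Magnitude computed in unsigned arithmetic to avoid overflow. */
		unsigned long mag = 0UL - (unsigned long)delta;

		return mag > cur ? 0 : cur - mag;
	}
	return cur + (unsigned long)delta;
}

/* Apply the change in a task's h_load to the root runnable load sum. */
static void toy_update_runnable_load(unsigned long *root_avg,
				     struct toy_task *t,
				     unsigned long new_h_load)
{
	long delta;

	assert(root_avg && t);

	if (!t->on_rq)
		return;

	delta = (long)new_h_load - (long)t->enqueued_h_load;
	*root_avg = add_clamped(*root_avg, delta);
	/* Assumption: remember what was accounted so the next delta is right. */
	t->enqueued_h_load = new_h_load;
}
```

Without the clamp, a root value that has drifted below the task's recorded
contribution would wrap to a huge number on the plain += delta.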

-- 
All Rights Reversed.
