Message-ID: <20131217181255.GG28621@e103034-lin>
Date:	Tue, 17 Dec 2013 18:12:55 +0000
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Alex Shi <alex.shi@...aro.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
	"daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
	"fweisbec@...il.com" <fweisbec@...il.com>,
	"linux@....linux.org.uk" <linux@....linux.org.uk>,
	"tony.luck@...el.com" <tony.luck@...el.com>,
	"fenghua.yu@...el.com" <fenghua.yu@...el.com>,
	"tglx@...utronix.de" <tglx@...utronix.de>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"arjan@...ux.intel.com" <arjan@...ux.intel.com>,
	"pjt@...gle.com" <pjt@...gle.com>,
	"fengguang.wu@...el.com" <fengguang.wu@...el.com>,
	"james.hogan@...tec.com" <james.hogan@...tec.com>,
	"jason.low2@...com" <jason.low2@...com>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"hanjun.guo@...aro.org" <hanjun.guo@...aro.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/4] sched: remove cpu_load decay

On Tue, Dec 17, 2013 at 03:37:23PM +0000, Peter Zijlstra wrote:
> On Tue, Dec 17, 2013 at 02:04:57PM +0000, Morten Rasmussen wrote:
> > On Sat, Dec 14, 2013 at 01:27:59PM +0000, Alex Shi wrote:
> > > On 12/14/2013 04:03 AM, Peter Zijlstra wrote:
> > > > 
> > > > 
> > > > I had a quick peek at the actual patches.
> > > > 
> > > > afaict we're now using weighted_cpuload() aka runnable_load_avg as the
> > > > ->cpu_load. Whatever happened to also using the blocked_avg?
> > 
> > AFAICT, ->cpu_load is actually a snapshot value of weighted_cpuload()
> > that gets updated occasionally. That has been the case since b92486cb.
> > By removing the cpu_load indexes, {source,target}_load() are now
> > comparing an old snapshot of weighted_cpuload() with the current
> > value. I don't think that really makes sense.
> 
> Agreed; worse, cpu_load is a very recent snapshot, so there has not
> been much chance for the two to diverge since we last looked at
> it.
> 
> [ for busy load-balance, for newidle there might be since we can run
> between ticks ]
> 
> > weighted_cpuload() may change rapidly
> > when tasks are enqueued or dequeued so the old snapshot doesn't have
> > much meaning in my opinion. Maybe I'm missing something?
> 
> Right, which is where it makes sense to also account some of the blocked
> load, since that anticipates these arrivals/departures and should smooth
> out the over-all load pictures. Which is something that sounds right for
> balancing.
> 
> You don't want to really care too much about the high freq fluctuation,
> but care more about the longer term load.
> 
> Or rather -- and this is where the idx thing came from, you want a
> longer term view the bigger your sched_domain is. Since that balances
> nicely against the cost of actually moving tasks around.

That makes sense.

> 
> And while runnable_load_avg still includes high freq arrival/departure
> events, the runnable+blocked load should have much less of that.

Agreed, we either need a smooth version of runnable_load_avg or need to
add the blocked load (given that we fix the priority issue).

There is actually another long-term view of the cpu load in
rq->avg.runnable_avg_sum but I think it might be too conservative. Also
it doesn't track the weight of the tasks on the cpu, just whether the
cpu was idle or not.
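
To illustrate why such a signal ignores task weight, here is a toy
model of a geometrically decayed busy-time average. It is a sketch and
not kernel code: the names toy_avg, toy_avg_update() and
toy_avg_busy_ratio() are hypothetical, floating point is used for
brevity, and the 32-period half-life and 1024-unit period are
assumptions of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed decay factor: 2^(-1/32), so a contribution halves after 32 periods. */
#define DECAY_Y 0.97857206

/* Hypothetical stand-in for a per-cpu average; not the kernel structure. */
struct toy_avg {
    double runnable_sum; /* decayed time during which the cpu was not idle */
    double period_sum;   /* decayed total elapsed time */
};

/* Account one period. Only busy/idle is recorded, never task weight. */
static void toy_avg_update(struct toy_avg *a, int cpu_was_busy)
{
    assert(a != NULL);
    a->runnable_sum = a->runnable_sum * DECAY_Y + (cpu_was_busy ? 1024.0 : 0.0);
    a->period_sum = a->period_sum * DECAY_Y + 1024.0;
}

/* Fraction of (decayed) time the cpu was busy, in the range 0.0 to 1.0. */
static double toy_avg_busy_ratio(const struct toy_avg *a)
{
    assert(a != NULL);
    return a->period_sum > 0.0 ? a->runnable_sum / a->period_sum : 0.0;
}
```

In this model a cpu that has been busy for a long time and then idles
for 32 periods still reports roughly half its former busy ratio, which
is the slow, conservative behaviour described above.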

> 
> > Comparing cpu_load indexes with different decay rates in {source,
> > target}_load() sort of makes sense as it makes load-balancing decisions
> > more conservative.
> 
> *nod*
> 
> > I believe we have discussed using blocked_load_avg in weighted_cpuload()
> > in the past. While it seems to be the right thing to include it, it
> > causes problems related to the priority scaling of the task loads.
> > If you include a blocked load in the weighted_cpuload() and have a tiny
> > (very low cpu utilization) task running at very high priority, your
> > weighted_cpuload() will be quite high and will force other normal
> > priority tasks away from the cpu, leaving the cpu idle most of the time.
> 
> Ah, right. Which is where we should look at balancing utilization as
> well as weight.
> 
> Let me ponder this a bit more.

Yes. At least for Android devices this is a big deal.

Would it be too invasive to have an unweighted_cpuload() for balancing
utilization? It would require maintaining an unweighted version of
runnable_load_avg and blocked load.
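
As a rough illustration of the difference, the toy model below
computes both a weight-scaled load and an unweighted one for the same
set of tasks. It is a sketch, not kernel code: toy_task,
toy_weighted_cpuload() and toy_unweighted_cpuload() are hypothetical
names, and the per-task runnable fraction is simply given as input
rather than tracked. The weights used in the comments (1024 for nice 0,
88761 for nice -20) are taken from the scheduler's nice-to-weight
table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SCALE 1024UL /* fixed-point unit: 1024 == runnable 100% of the time */

/* Hypothetical task descriptor for illustration only. */
struct toy_task {
    unsigned long weight;   /* priority weight: 1024 at nice 0, 88761 at nice -20 */
    unsigned long runnable; /* fraction of time runnable, 0..TOY_SCALE */
};

/* Load scaled by priority weight, in the spirit of weighted_cpuload(). */
static unsigned long toy_weighted_cpuload(const struct toy_task *t, size_t n)
{
    unsigned long load = 0;

    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        load += t[i].weight * t[i].runnable / TOY_SCALE;
    return load;
}

/* Same sum without the weight: a pure utilization figure. */
static unsigned long toy_unweighted_cpuload(const struct toy_task *t, size_t n)
{
    unsigned long load = 0;

    assert(t != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        load += t[i].runnable;
    return load;
}
```

With these definitions, a nice -20 task that is runnable about 1% of
the time (runnable = 10) contributes a weighted load of 866, close to
the 1024 of an always-running nice-0 task, while its unweighted
contribution is only 10.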

Maybe you have better ideas.

> 
> > > 
> > > When enabling the sched_avg in load balance, I didn't find any
> > > positive test results in several attempts at using blocked_avg,
> > > just a few regressions. :(
> > > 
> > > And since this patchset is almost purely a cleanup, I did not try
> > > blocked_load_avg again...
> > 
> > My worry here is that I don't really understand why the current code
> > works when the decayed cpu_load has been removed.
> 
> Not too much different from before I think; but it does lose the longer
> term view on the bigger domains. That in turn makes it slightly more
> aggressive, which can be good or bad depending on the workload (good on
> high spawn loads like hackbench, bad on more gentle stuff that has
> cache footprint).
> 
> > > > I totally hate patch 4; it seems like a random hack to make up for the
> > > > lack of blocked_avg.
> > > 
> > > Yes, this bias criterion seems a bit arbitrary. :)
> > 
> > This is why I think {source, target}_load() and their use need to be
> > reconsidered.
> 
> Aside from that, there's something entirely wrong with 4 in that we
> already have an imbalance between source and target loads, adding
> another basically random imbalance pass on top of that just doesn't make
> any kind of sense what so ff'ing ever.

Agreed.

Morten