linux-kernel - Re: [PATCH RFC] sched/fair: let cpu's cfs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160405065644.GA29778@leoy-linaro>
Date:	Tue, 5 Apr 2016 14:56:44 +0800
From:	Leo Yan <leo.yan@...aro.org>
To:	Morten Rasmussen <morten.rasmussen@....com>
Cc:	Steve Muckle <steve.muckle@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	linux-kernel@...r.kernel.org, eas-dev@...ts.linaro.org
Subject: Re: [PATCH RFC] sched/fair: let cpu's cfs_rq to reflect task
 migration

On Mon, Apr 04, 2016 at 09:48:23AM +0100, Morten Rasmussen wrote:
> On Sat, Apr 02, 2016 at 03:11:54PM +0800, Leo Yan wrote:
> > On Fri, Apr 01, 2016 at 03:28:49PM -0700, Steve Muckle wrote:
> > > I think I follow - Leo please correct me if I mangle your intentions.
> > > It's an issue that Morten and Dietmar had mentioned to me as well.
> 
> Yes. We have been working on this issue for a while without getting to a
> nice solution yet.

Good to know this. This patch is mainly for discussion purpose.

[...]

> > > Leo I noticed you did not modify detach_entity_load_average(). I think
> > > this would be needed to avoid the task's stats being double counted for
> > > a while after switched_from_fair() or task_move_group_fair().
> 
> I'm afraid that the solution to problem is more complicated than that
> :-(
> 
> You are adding/removing a contribution from the root cfs_rq.avg which
> isn't part of the signal in the first place. The root cfs_rq.avg only
> contains the sum of the load/util of the sched_entities on the cfs_rq.
> If you remove the contribution of the tasks from there you may end up
> double-accounting for the task migration. Once due to you patch and then
> again slowly over time as the group sched_entity starts reflecting that
> the task has migrated. Furthermore, for group scheduling to make sense
> it has to be the task_h_load() you add/remove otherwise the group
> weighting is completely lost. Or am I completely misreading your patch?

Here have one thing want to confirm firstly: though CFS has maintained
task group's hierarchy, but between task group's cfs_rq.avg and root
cfs_rq.avg, CFS updates these signals independently rather than accouting
them by crossing the hierarchy.

So currently CFS decreases the group's cfs_rq.avg for task's migration,
but it don't iterate task group's hierarchy to root cfs_rq.avg. I
don't understand your meantioned the second accounting by "then again
slowly over time as the group sched_entity starts reflecting that the
task has migrated."

Another question is: does cfs_rq.avg _ONLY_ signal historic behavior but
not present behavior? so even the task has been migrated we still need
decay it slowly? Or this will be different between load and util?

> I don't think the slow response time for _load_ is necessarily a big
> problem. Otherwise we would have had people complaining already about
> group scheduling being broken. It is however a problem for all the
> initiatives that built on utilization.

Or maybe we need seperate utilization and load, these two signals
have different semantics and purpose.

Thanks,
Leo Yan