Message-ID: <56FEF621.3070404@linaro.org>
Date:	Fri, 1 Apr 2016 15:28:49 -0700
From:	Steve Muckle <steve.muckle@...aro.org>
To:	Peter Zijlstra <peterz@...radead.org>, Leo Yan <leo.yan@...aro.org>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Morten Rasmussen <morten.rasmussen@....com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	linux-kernel@...r.kernel.org, eas-dev@...ts.linaro.org
Subject: Re: [PATCH RFC] sched/fair: let cpu's cfs_rq to reflect task
 migration

On 04/01/2016 12:49 PM, Peter Zijlstra wrote:
> On Sat, Apr 02, 2016 at 12:38:37AM +0800, Leo Yan wrote:
>> When a task is migrated from CPU_A to CPU_B, the scheduler removes
>> the task's load/util from the task's cfs_rq and adds them to the
>> destination cfs_rq. But if the kernel enables CONFIG_FAIR_GROUP_SCHED,
>> this cfs_rq is not necessarily the same as the cpu's cfs_rq. As a
>> result, after the task is migrated to CPU_B, CPU_A still carries the
>> task's stale load/util values, while CPU_B cannot yet reflect the new
>> load/util introduced by the task.
>>
>> So this patch applies the task's load/util to the cpu's cfs_rq as
>> well, so that the cpu's cfs_rq really reflects the task's migration.
> 
> Sorry but that is unintelligible.
> 
> What is the problem? Why do we care? How did you fix it? and at what
> cost?

I think I follow - Leo please correct me if I mangle your intentions.
It's an issue that Morten and Dietmar had mentioned to me as well.

Assume CONFIG_FAIR_GROUP_SCHED is enabled and a task is running in a
task group other than the root. The task migrates from one CPU to
another. The cfs_rq.avg structures on the src and dest CPUs
corresponding to the group containing the task are updated immediately
via remove_entity_load_avg()/update_cfs_rq_load_avg() and
attach_entity_load_avg(). But the cfs_rq.avg corresponding to the root
on the src and dest are not immediately updated. The root cfs_rq.avg
must catch up over time with PELT.

As to why we care, there's at least one issue which may or may not be
Leo's - the root cfs_rq is the one whose avg structure we read from to
determine the frequency with schedutil. If you have a cpu-bound task in
a non-root cgroup which periodically migrates among CPUs on an otherwise
idle system, I believe each time it migrates the frequency would drop to
fmin and have to ramp up again with PELT.

Leo I noticed you did not modify detach_entity_load_avg(). I think
this would be needed to avoid the task's stats being double counted for
a while after switched_from_fair() or task_move_group_fair().
