linux-kernel - Re: [PATCH 2/2] sched: update runqueue clock before migrations away


Message-ID: <xm26vbynmk2y.fsf@sword-of-the-dawn.mtv.corp.google.com>
Date:	Tue, 17 Dec 2013 10:03:01 -0800
From:	bsegall@...gle.com
To:	Chris Redpath <chris.redpath@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"pjt\@google.com" <pjt@...gle.com>,
	"mingo\@redhat.com" <mingo@...hat.com>,
	"alex.shi\@linaro.org" <alex.shi@...aro.org>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	Dietmar Eggemann <Dietmar.Eggemann@....com>,
	"linux-kernel\@vger.kernel.org" <linux-kernel@...r.kernel.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH 2/2] sched: update runqueue clock before migrations away

Chris Redpath <chris.redpath@....com> writes:

> On 12/12/13 18:24, Peter Zijlstra wrote:
>> Would pre_schedule_idle() -> rq_last_tick_reset() -> rq->last_sched_tick
>> be useful?
>>
>> I suppose we could easily lift that to NO_HZ_COMMON.
>>
>
> Many thanks for the tip Peter, I have tried this out and it does provide enough
> information to be able to correct the problem. The new version doesn't update
> the rq, just carries the extra unaccounted time (estimated from the jiffies)
> over to be processed during enqueue.
>
> However before I send a new patch set I have a question about the existing
> behavior. Ben, you may already know the answer to this?
>
> During a wake migration we call __synchronize_entity_decay in
> migrate_task_rq_fair, which will decay avg.runnable_avg_sum. We also record the
> amount of periods we decayed for as a negative number in avg.decay_count.
>
> We then enqueue the task on its target runqueue, and again we decay the load by
> the number of periods it has been off-rq.
>
> if (unlikely(se->avg.decay_count <= 0)) {
> 	se->avg.last_runnable_update = rq_clock_task(rq_of(cfs_rq));
> 	if (se->avg.decay_count) {
> 		se->avg.last_runnable_update -= (-se->avg.decay_count)
> 							<< 20;
>>>>		update_entity_load_avg(se, 0);
>
> Am I misunderstanding how this is supposed to work or have we been always
> double-accounting sleep time for wake migrations?

No, the sleep time is not double-accounted; the two calls decay different
things. __synchronize_entity_decay decays load_avg_contrib in order to
figure out how much to remove from old_cfs_rq->blocked_load.
update_entity_load_avg updates the underlying runnable_avg_sum and
runnable_avg_period, which are then used to recompute load_avg_contrib.

(Normally we update runnable_avg_sum, which updates load_avg_contrib via
__update_entity_load_avg_contrib. Here we go in the reverse direction
because we don't hold the right rq locks at the right times.)
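
To make the two directions concrete, here is a toy model of the above. It is
only a sketch: it uses doubles instead of the kernel's fixed-point math, it
assumes the averaging window is already saturated, and every toy_* name is
made up for illustration rather than taken from kernel/sched/fair.c. The
migration-time step decays only the derived contribution, the enqueue-time
step ages the underlying sums over the same sleep and recomputes the
contribution from them, and the result matches a single decay.

```c
/* Toy model only; not kernel code. All toy_* names are hypothetical. */
#include <assert.h>

#define TOY_Y      0.97857206	/* per-period decay; TOY_Y^32 is about 0.5 */
#define TOY_PERIOD 1024.0	/* time accrued per accounting period */

struct toy_avg {
	double runnable_sum;	/* decayed time spent runnable */
	double runnable_period;	/* decayed total time */
	double load_contrib;	/* derived: weight * sum / period */
};

/* Decay a value by n periods. */
static double toy_decay(double val, int n)
{
	assert(n >= 0);
	while (n-- > 0)
		val *= TOY_Y;
	return val;
}

/* Normal direction: recompute the derived contribution from the sums. */
static void toy_update_contrib(struct toy_avg *a, double weight)
{
	a->load_contrib = weight * a->runnable_sum / a->runnable_period;
}

/*
 * Migration-time step: decay only the derived contribution and return it,
 * so the caller knows how much to subtract from the old queue's blocked
 * load. The underlying sums are left untouched.
 */
static double toy_sync_decay(struct toy_avg *a, int slept)
{
	a->load_contrib = toy_decay(a->load_contrib, slept);
	return a->load_contrib;
}

/*
 * Enqueue-time step: age the underlying sums over the same sleep, then
 * recompute the contribution from them (overwriting the value above).
 */
static void toy_enqueue_update(struct toy_avg *a, double weight, int slept)
{
	for (int i = 0; i < slept; i++) {
		a->runnable_sum *= TOY_Y;	/* asleep: no new runnable time */
		a->runnable_period = a->runnable_period * TOY_Y + TOY_PERIOD;
	}
	toy_update_contrib(a, weight);
}

/* Returns 1 if running both steps leaves the contribution decayed once. */
static int toy_sleep_counted_once(int slept)
{
	const double weight = 1024.0;
	const double saturated = TOY_PERIOD / (1.0 - TOY_Y);
	struct toy_avg a = { saturated / 2, saturated, 0.0 };	/* 50% runnable */
	double before, expected, removed, d1, d2;

	toy_update_contrib(&a, weight);
	before = a.load_contrib;
	expected = toy_decay(before, slept);

	removed = toy_sync_decay(&a, slept);
	toy_enqueue_update(&a, weight, slept);

	d1 = removed - expected;
	d2 = a.load_contrib - expected;
	return d1 > -1e-6 && d1 < 1e-6 && d2 > -1e-6 && d2 < 1e-6;
}
```

Calling toy_sleep_counted_once(5) from a small test program should return 1
in this model. If the sleep were applied twice to the same quantity, the
final contribution would come out a factor of TOY_Y^5 too small and the
check would fail.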