Message-ID: <2551684c-c987-b143-ba69-4fb0c55f61c7@arm.com>
Date: Tue, 21 Dec 2021 13:46:01 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Donnefort <vincent.donnefort@....com>
Cc: peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
linux-kernel@...r.kernel.org, valentin.schneider@....com,
morten.rasmussen@....com, chris.redpath@....com,
qperret@...gle.com, lukasz.luba@....com
Subject: Re: [PATCH 2/4] sched/fair: Decay task PELT values during migration

On 20.12.21 17:09, Vincent Donnefort wrote:
> On Mon, Dec 20, 2021 at 12:26:23PM +0100, Dietmar Eggemann wrote:
>> On 09.12.21 17:11, Vincent Donnefort wrote:
[...]
>> Why do you use `avg.last_update_time` (lut) of the root cfs_rq here?
>>
>> p's lut was just synced to cfs_rq_of(se)'s lut in
>>
>> migrate_task_rq_fair() (1) -> remove_entity_load_avg() ->
>> sync_entity_load_avg(se) (2)
>
> Huum, indeed, the estimation is an offset on top of the se's last_update_time,
> which I suppose could be different from the rq's cfs_rq.
>
> I'll add a sched_entity argument for this function, to use either cfs_rq_of(se)
> or se last_update_time

OK, or a `u64 now` or `u64 lut` argument.

[...]
>>> } else {
>>> + remove_entity_load_avg(se);
>>> +
>>> /*
>>> - * We are supposed to update the task to "current" time, then
>>> - * its up to date and ready to go to new CPU/cfs_rq. But we
>>> - * have difficulty in getting what current time is, so simply
>>> - * throw away the out-of-date time. This will result in the
>>> - * wakee task is less decayed, but giving the wakee more load
>>> - * sounds not bad.
>>> + * Here, the task's PELT values have been updated according to
>>> + * the current rq's clock. But if that clock hasn't been
>>> + * updated in a while, a substantial idle time will be missed,
>>> + * leading to an inflation after wake-up on the new rq.
>>> + *
>>> + * Estimate the PELT clock lag, and update sched_avg to ensure
>>> + * PELT continuity after migration.
>>> */
>>> - remove_entity_load_avg(&p->se);
>>> + __update_load_avg_blocked_se(rq_clock_pelt_estimator(rq), se);
>>
>> We do __update_load_avg_blocked_se() now twice for p, 1. in (2) and then
>> in (1) again.
>
> The first __update_load_avg_blocked_se() ensures the se is aligned with the
> cfs_rq's clock, and then the "removed" struct is updated accordingly. We couldn't
> use the estimator there; it would break that structure.

You're right. I missed this bit.

Related to this: it looks like with CAS/EAS we don't rely on
remove_entity_load_avg()->sync_entity_load_avg(se), since
sync_entity_load_avg() is already called during select_task_rq().
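
For illustration only (this is not code from the patch): the new comment in
the hunk above says that a stale rq clock makes the task miss idle time and
come out inflated on the new rq. A minimal user-space sketch of that effect,
assuming the usual PELT geometry (1024 us periods, y^32 = 0.5). The name
pelt_decay_sketch() and the floating-point arithmetic are hypothetical and
chosen for readability; the kernel uses fixed-point helpers instead.

```c
#include <stdint.h>

#define PELT_PERIOD_US  1024ULL     /* one PELT period, roughly 1 ms */
#define PELT_HALFLIFE   32          /* periods after which a signal halves */
#define PELT_Y          0.97857206  /* approximately 0.5^(1/32) */

/*
 * Hypothetical helper: decay 'val' for the whole PELT periods that elapsed
 * between 'lut_us' (last update time) and 'now_us'. Partial periods are
 * ignored to keep the sketch short.
 */
static double pelt_decay_sketch(double val, uint64_t lut_us, uint64_t now_us)
{
	uint64_t periods;

	if (now_us <= lut_us)
		return val;

	periods = (now_us - lut_us) / PELT_PERIOD_US;

	/* After this many half-lives nothing measurable is left. */
	if (periods > 63ULL * PELT_HALFLIFE)
		return 0.0;

	/* Each full half-life halves the signal exactly. */
	for (; periods >= PELT_HALFLIFE; periods -= PELT_HALFLIFE)
		val *= 0.5;

	/* Remaining periods each decay by y. */
	for (; periods > 0; periods--)
		val *= PELT_Y;

	return val;
}
```

With this sketch, pelt_decay_sketch(1024.0, 0, 32 * 1024) returns 512.0. If
'now' comes from a clock that lags by those 32 periods, the same call sees no
elapsed time and returns 1024.0, i.e. twice the value the task should carry
to the new rq.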