Message-ID: <5250a0fc-3470-b313-0810-5d7a68c7cf50@arm.com>
Date: Thu, 9 Mar 2023 10:09:39 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Zhang Qiao <zhangqiao22@...wei.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
vschneid@...hat.com, rkagan@...zon.de
Subject: Re: [PATCH v2] sched/fair: sanitize vruntime of entity being migrated
On 09/03/2023 09:37, Zhang Qiao wrote:
>
> On 2023/3/8 20:55, Vincent Guittot wrote:
>> On Wednesday, 08 March 2023 at 09:01:05 (+0100), Vincent Guittot wrote:
>>> On Tue, 7 Mar 2023 at 14:41, Zhang Qiao <zhangqiao22@...wei.com> wrote:
[...]
>>>> On 2023/3/7 18:26, Vincent Guittot wrote:
>>>>> On Mon, 6 Mar 2023 at 14:53, Vincent Guittot <vincent.guittot@...aro.org> wrote:
>>>>>>
>>>>>> On Mon, 6 Mar 2023 at 13:57, Zhang Qiao <zhangqiao22@...wei.com> wrote:
[...]
>> +static inline bool migrate_long_sleeper(struct sched_entity *se)
>> +{
>> + struct cfs_rq *cfs_rq;
>> + u64 sleep_time;
>> +
>> + if (se->exec_start == 0)
>
> How about using `se->avg.last_update_time == 0` here?
IMHO, neither check is needed here since we're still dealing with the
originating CPU of the migration. Both fields are set to 0 only at the
end of migrate_task_rq_fair().
>> + return false;
>> +
>> + cfs_rq = cfs_rq_of(se);
>> + /*
>> + * If the entity slept for a long time, don't even try to normalize its
>> + * vruntime with the base as it may be too far off and might generate
>> + * wrong decision because of s64 overflow.
>> + * We estimate its sleep duration with the last update of se's pelt.
>> + * The last update happened before sleeping. The cfs' pelt is not
>> + * always updated when cfs is idle but this is not a problem because
>> + * its min_vruntime is not updated too, so the situation can't get
>> + * worse.
>> + */
>> + sleep_time = cfs_rq_last_update_time(cfs_rq) - se->avg.last_update_time;
Looks like this doesn't work for asymmetric CPU capacity systems since
we specifically do a sync_entity_load_avg() in select_task_rq_fair()
(find_energy_efficient_cpu() for EAS and select_idle_sibling() for CAS)
to sync cfs_rq and se (including their last_update_time).
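
To illustrate the concern, here is a toy model in plain C (not kernel
code). The toy_* structs and helpers are made-up stand-ins, and
toy_sync_entity() only models the one effect of sync_entity_load_avg()
that matters here, namely that se's last_update_time is brought up to
the cfs_rq's. Under that assumption, the sleep estimate collapses to 0
once the sync has run:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_cfs_rq {
	uint64_t last_update_time;	/* models cfs_rq_last_update_time() */
};

struct toy_se {
	uint64_t last_update_time;	/* models se->avg.last_update_time */
};

/*
 * Assumed effect of sync_entity_load_avg() for this sketch: the
 * entity's PELT timestamp is advanced to the cfs_rq's timestamp.
 */
static void toy_sync_entity(struct toy_se *se, const struct toy_cfs_rq *cfs_rq)
{
	se->last_update_time = cfs_rq->last_update_time;
}

/* Same subtraction as the sleep_time estimate in the patch. */
static uint64_t toy_sleep_time(const struct toy_se *se,
			       const struct toy_cfs_rq *cfs_rq)
{
	return cfs_rq->last_update_time - se->last_update_time;
}
```

With a cfs_rq timestamp of 5000 and an se timestamp of 1000,
toy_sleep_time() gives 4000 before toy_sync_entity() and 0 afterwards,
so a long sleeper would no longer be detected in this model.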
[...]