Message-ID: <7790c176-6af3-458c-9995-8bceb73b64bc@arm.com>
Date: Tue, 14 Nov 2023 13:59:02 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Hongyan Xia <hongyan.xia2@....com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
Morten Rasmussen <morten.rasmussen@....com>,
Lukasz Luba <lukasz.luba@....com>,
Christian Loehle <christian.loehle@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/6] sched/uclamp: Track uclamped util_avg in
sched_avg
On 09/11/2023 17:05, Hongyan Xia wrote:
> On 31/10/2023 15:52, Dietmar Eggemann wrote:
>> On 04/10/2023 11:04, Hongyan Xia wrote:
>>> From: Hongyan Xia <hongyan.xia2@....com>
>>
>> [...]
>>
>>> @@ -6445,6 +6450,21 @@ static int sched_idle_cpu(int cpu)
>>> }
>>> #endif
>>> +void ___update_util_avg_uclamp(struct sched_avg *avg, struct sched_entity *se);
>>
>> IMHO, `struct sched_avg *avg` can only be the one of a cfs_rq. So
>> passing a cfs_rq would eliminate the question whether this can be from
>> another se.
>
> At the moment, yes, the avg can only come from cfs_rq. The reason why I
> kept sched_avg is that once we have sum aggregation for RT tasks, it's
> very likely we will end up calling this function on rt_rq->avg, so
> having just sched_avg here will work for RT in the future.
Ah, OK. IMHO it would be better to use cfs_rq for now and widen this
interface once RT is covered.
[...]
>> Question for me is why can't you integrate the util_avg_uclamp signals
>> for se's and cfs_rq's/rq's much closer into existing PELT functions?
>
> So the problem is that when we enqueue the task (say with UCLAMP_MIN of
> 200), at that exact moment se->on_rq is false, so we only decay it and
> do not enforce UCLAMP_MIN. Further up the hierarchy we do
> update_load_avg(), but the se of the task has already been processed,
> so UCLAMP_MIN has not taken any effect. To make sure UCLAMP_MIN is
> immediately effective, I just re-evaluate the whole hierarchy from
> bottom to top.
>
> I probably didn't quite catch what you said here. Could you elaborate a
> bit on what 'much closer' means?
I see. But can we not use DO_ATTACH (and DO_DETACH) for this?

(flags & DO_ATTACH) characterizes the enqueuing of the task, so with
DO_ATTACH (and hence !se->on_rq and cfs_rq->curr != se) we could (a)
decay the task and (b) add it to the cfs_rq in enqueue_entity() ->
update_load_avg(..., | DO_ATTACH).
Like we do for PELT on a wakeup-migration enqueue, (a):

  update_load_avg()
    __update_load_avg_se()
      if (___update_load_sum(..., cfs_rq->curr == se))
                                  ^^^^^^^^^^^^^^^^^^
                                  'int running' for utilization
        ___update_load_avg()
    if (!se->avg.last_update_time && (flags & DO_ATTACH))
        ^^^^^^^^^^^^^^^^^^^^^^^^^
        wakeup migration
      attach_entity_load_avg()
Notice, for PELT we attach/detach to/from the cfs_rq, which gives us
blocked contribution. For util_avg_uclamp we do enqueue/dequeue, so
only runnable contribution.