Message-ID: <d1b613d5-2207-45dd-8aa2-a276502ccaa1@arm.com>
Date: Tue, 31 Oct 2023 16:52:23 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Hongyan Xia <Hongyan.Xia2@....com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
Morten Rasmussen <morten.rasmussen@....com>,
Lukasz Luba <lukasz.luba@....com>,
Christian Loehle <christian.loehle@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/6] sched/uclamp: Track uclamped util_avg in
sched_avg
On 04/10/2023 11:04, Hongyan Xia wrote:
> From: Hongyan Xia <hongyan.xia2@....com>
[...]
> @@ -6445,6 +6450,21 @@ static int sched_idle_cpu(int cpu)
> }
> #endif
>
> +void ___update_util_avg_uclamp(struct sched_avg *avg, struct sched_entity *se);
IMHO, `struct sched_avg *avg` can only be that of a cfs_rq, so passing a
cfs_rq directly would eliminate the question of whether it could come
from another se.
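I.e. something like this (just a sketch of the suggested signature
change; the body is meant to stay as in your patch):

  void ___update_util_avg_uclamp(struct cfs_rq *cfs_rq,
				 struct sched_entity *se)
  {
	struct sched_avg *avg = &cfs_rq->avg;

	/* ... body as in the patch, operating on avg ... */
  }

with the call site in update_se_chain() below becoming
___update_util_avg_uclamp(cfs_rq, se).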
> +static void update_se_chain(struct task_struct *p)
> +{
> +#ifdef CONFIG_UCLAMP_TASK
> + struct sched_entity *se = &p->se;
> + struct rq *rq = task_rq(p);
> +
> + for_each_sched_entity(se) {
> + struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> + ___update_util_avg_uclamp(&cfs_rq->avg, se);
> + }
> +#endif
> +}
> /*
> * The enqueue_task method is called before nr_running is
> * increased. Here we update the fair scheduling stats and
> @@ -6511,6 +6531,16 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> goto enqueue_throttle;
> }
>
> + /*
> + * Re-evaluate the se hierarchy now that on_rq is true. This is
> + * important to enforce uclamp the moment a task with uclamp is
> + * enqueued, rather than waiting a timer tick for uclamp to kick in.
> + *
> + * XXX: This duplicates some of the work already done in the above for
> + * loops.
> + */
> + update_se_chain(p);
I'm trying to figure out why this is necessary here:
enqueue_task_fair()

  for_each_sched_entity()
    enqueue_entity()
      update_load_avg()
        __update_load_avg_se()
          ___update_util_avg_uclamp()        <-- if se->on_rq, update p (se)
                                                 if we cross PELT period
                                                 boundaries              (1)
          ___decay_util_avg_uclamp_towards() <-- decay p (se)            (2)
      enqueue_util_avg_uclamp()              <-- enqueue p into cfs_rq   (3)
      se->on_rq = 1                          <-- set p (se) on_rq        (4)

  for_each_sched_entity()
    update_load_avg()                        <-- update all on_rq se's
                                                 in the hierarchy        (5)

  update_se_chain()                          <-- update p (se) and its
                                                 se hierarchy            (6)
(1) Skip p since it is !se->on_rq.

(2) Decay p->se->avg.util_avg_uclamp to 0 since it was sleeping.

(3) Attach p to its cfs_rq.

...

(6) Update all se's and cfs_rq's involved in p's task_group hierarchy
    (including the propagation from se (level=x+1) to cfs_rq (level=x)).
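So the extra pass in (6) only exists because of the on_rq guard hit in
(1). A minimal sketch of that guard (the function name is from your
patch, the body is my paraphrase, not the actual implementation):

  void ___update_util_avg_uclamp(struct sched_avg *avg,
				 struct sched_entity *se)
  {
	/*
	 * During the first enqueue pass p is still !se->on_rq, so it is
	 * skipped here. Only after enqueue_util_avg_uclamp() in (3) and
	 * se->on_rq = 1 in (4) can a walk over the hierarchy pick p up
	 * again, which is what update_se_chain() in (6) does.
	 */
	if (!se->on_rq)
		return;

	/* ... update avg->util_avg_uclamp from se's clamped util ... */
  }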
The question for me is why you can't integrate the util_avg_uclamp
signals for se's and cfs_rq's/rq's much more closely into the existing
PELT functions?
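Something in this direction (purely illustrative, the existing
update_load_avg() internals are elided, and whether the hook belongs
here or deeper in the PELT code is an open question):

  static inline void update_load_avg(struct cfs_rq *cfs_rq,
				     struct sched_entity *se, int flags)
  {
	/* ... existing PELT decay/accumulate and attach handling ... */

	/*
	 * Illustrative only: since se->on_rq is already set when the
	 * second for_each_sched_entity() loop of enqueue_task_fair()
	 * runs, propagating util_avg_uclamp from here would cover the
	 * whole hierarchy and make the extra update_se_chain() walk
	 * unnecessary.
	 */
	___update_util_avg_uclamp(&cfs_rq->avg, se);
  }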
[...]