Message-ID: <20251223172733.qjqjkwdhjoba7g4c@airbuntu>
Date: Tue, 23 Dec 2025 17:27:33 +0000
From: Qais Yousef <qyousef@...alina.io>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Samuel Wu <wusamuel@...gle.com>, mingo@...hat.com, peterz@...radead.org,
juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, linux-kernel@...r.kernel.org,
Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [PATCH] sched/fair: Fix pelt lost idle time detection

On 12/13/25 04:54, Vincent Guittot wrote:
> > For completeness, here are some Perfetto traces that show threads
> > running, CPU frequency, and PELT related stats. I've pinned the
> > util_avg track for a CPU on the little cluster, as the util_avg metric
> > shows an obvious increase (~66 with the patch vs ~3 without it).
>
> I was focusing on the update of rq->lost_idle_time but it can't be
> related because the CPUs are often idle in your trace. But the patch also
> updates rq->clock_idle and rq->clock_pelt_idle, which are used to
> sync cfs task util_avg at wakeup when it is about to migrate and prev
> cpu is idle.
>
> Before the patch we could have old clock_pelt_idle and clock_idle values
> that were used to decay the util_avg of cfs tasks before migrating them,
> which would end up decaying util_avg too much.
>
> But I noticed that you pinned util_avg_rt, which doesn't use the 2
> fields above in mainline. Does the android kernel make some changes to rt
> util_avg tracking?

We shouldn't be doing that. I think we were not updating RT pressure correctly
before the patch. The new values make more sense to me: RT tasks are running
2ms every 10ms, so a util_avg_rt in the ~150 range makes more sense than the
previous values of 5-6? If we add the 20% headroom, that can easily saturate
the little core.
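
As a rough sanity check of the 2ms-every-10ms figure, here is a toy PELT-style
model. It is only a sketch, not kernel code: it steps in 1ms units rather than
the kernel's 1024us periods, uses the 32-period half-life, and ignores
frequency and CPU capacity invariance, so the absolute numbers will not match
a trace from a little core. The function name simulate_util is made up for
this illustration.

```python
# Toy model of a PELT-like geometric average. NOT kernel code.
# Assumptions: 1 ms steps, half-life of 32 steps, no frequency or
# capacity invariance (i.e. the CPU is treated as running at full capacity).
SCHED_CAPACITY_SCALE = 1024
Y = 0.5 ** (1 / 32)  # per-step decay factor, so that Y**32 == 0.5


def simulate_util(run_ms, period_ms, total_ms=2000):
    """Return (trough, peak) of the signal over the last full period."""
    util = 0.0
    trough, peak = float("inf"), 0.0
    for t in range(total_ms):
        running = (t % period_ms) < run_ms
        # Decay the old value, then add this step's contribution if running.
        util = util * Y + (SCHED_CAPACITY_SCALE * (1 - Y) if running else 0.0)
        if t >= total_ms - period_ms:
            trough = min(trough, util)
            peak = max(peak, util)
    return trough, peak


if __name__ == "__main__":
    trough, peak = simulate_util(run_ms=2, period_ms=10)
    print(f"steady state oscillates between {trough:.0f} and {peak:.0f}")
```

Under these assumptions the signal settles at roughly 20% of 1024, which is
the same order of magnitude as the post-patch values and far from
single-digit values.
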

update_rt_rq_load_avg() uses rq_clock_pelt(), which takes lost_idle_time into
account, and we now ensure that is updated in this corner case?
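
To make the role of lost_idle_time concrete, here is a simplified model of the
clock bookkeeping. It is a sketch based on a reading of mainline's
kernel/sched/pelt.h, where rq_clock_pelt() returns clock_pelt minus
lost_idle_time. The ToyRq class, its methods, and the account_lost_idle flag
are hypothetical names for illustration; the real code decides whether the rq
was saturated from the util_sum values, which is reduced to a boolean here.

```python
from dataclasses import dataclass


@dataclass
class ToyRq:
    """Hypothetical, simplified stand-in for the rq clock fields."""
    clock_task: int = 0      # task clock, in us
    clock_pelt: int = 0      # scaled clock that PELT advances on
    lost_idle_time: int = 0  # gap that PELT must not see as idle time

    def run(self, delta_us, scale=1.0):
        # While running below max capacity (scale < 1), clock_pelt
        # advances more slowly than clock_task.
        self.clock_task += delta_us
        self.clock_pelt += int(delta_us * scale)

    def enter_idle(self, fully_utilized, account_lost_idle=True):
        # On idle entry clock_pelt is synced to clock_task. If the rq was
        # saturated, the skipped gap was not real idle time, so it is
        # recorded and later subtracted in rq_clock_pelt().
        gap = self.clock_task - self.clock_pelt
        if fully_utilized and account_lost_idle:
            self.lost_idle_time += gap
        self.clock_pelt = self.clock_task

    def rq_clock_pelt(self):
        return self.clock_pelt - self.lost_idle_time


if __name__ == "__main__":
    for accounted in (True, False):
        rq = ToyRq()
        rq.run(10_000, scale=0.5)
        rq.enter_idle(fully_utilized=True, account_lost_idle=accounted)
        print(accounted, rq.rq_clock_pelt())
```

In this model, skipping the accounting makes the PELT clock appear to jump
forward by the gap, which a PELT update would then treat as extra time to
decay over.
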

I guess the first question is: which do you think is the right behavior for
the RT pressure?

And the second question: does it make sense to take RT pressure into account in
schedutil if there are no fair tasks? It is supposed to compensate for the
time stolen by RT so that we make fair tasks run faster. But if there are no
fair tasks, the RT pressure is meaningless on its own, as RT tasks should run
at max or at whatever value is specified by uclamp_min? I think in this test
uclamp_min is set to 0 by default for RT, so RT tasks are not expected to cause
the frequency to rise on their own.
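
To illustrate the second question, here is a toy comparison of the two
behaviors. This is not how schedutil is implemented: requested_util and the
ignore_rt_without_fair flag are hypothetical, the 20% headroom is just the
figure mentioned above, and the real aggregation also involves DL, IRQ, and
uclamp handling that is left out.

```python
SCHED_CAPACITY_SCALE = 1024


def requested_util(util_cfs, util_rt, nr_fair, rt_uclamp_min=0,
                   headroom_pct=20, ignore_rt_without_fair=False):
    """Toy utilization request for a CPU. Hypothetical, not kernel logic."""
    util = util_cfs + util_rt
    # Hypothetical policy: with no fair tasks queued, RT pressure has no
    # fair work to compensate for, so drop it from the sum.
    if ignore_rt_without_fair and nr_fair == 0:
        util = util_cfs
    util += util * headroom_pct // 100
    # RT tasks still get at least what their uclamp_min asks for.
    util = max(util, rt_uclamp_min)
    return min(util, SCHED_CAPACITY_SCALE)


if __name__ == "__main__":
    # RT-only CPU with util_avg_rt around 150 and uclamp_min of 0.
    print(requested_util(0, 150, nr_fair=0))
    print(requested_util(0, 150, nr_fair=0, ignore_rt_without_fair=True))
```

With the flag off, RT pressure alone produces a non-zero request even though
uclamp_min is 0. With it on, the request for an RT-only CPU falls back to
uclamp_min.
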
>
> >
> > - with patch: https://ui.perfetto.dev/#!/?s=964594d07a5a5ba51a159ba6c90bb7ab48e09326
> > - without patch:
> > https://ui.perfetto.dev/#!/?s=6ff6854c87ea187e4ca488acd2e6501b90ec9f6f