Message-ID: <ad78436f-43c3-4b4b-9cb5-28dffd43468a@arm.com>
Date: Thu, 2 Nov 2023 18:37:33 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Hongyan Xia <Hongyan.Xia2@....com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
Morten Rasmussen <morten.rasmussen@....com>,
Lukasz Luba <lukasz.luba@....com>,
Christian Loehle <christian.loehle@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 4/6] sched/fair: Rewrite util_fits_cpu()

On 04/10/2023 11:04, Hongyan Xia wrote:
> From: Hongyan Xia <hongyan.xia2@....com>
>
> Currently, there's no way to distinguish between 1) a CPU that is
> actually maxed out at its highest frequency and 2) one that is
> throttled because of UCLAMP_MAX, since both present util_avg values of
> 1024. This is problematic because when we try to pick a CPU for a task
> to run, we would like to give 2) a chance, or at least prefer 2) to 1).
>
> Current upstream gives neither a chance because the spare capacity is 0
> for either case. There are patches to fix this problem by considering 0
> capacities [1], but this might still be inefficient because this ends
> up treating 1) and 2) equally, and will always pick the same one because
> we don't change how we iterate through all CPUs. If we end up putting
> many tasks on 1), then this creates a seriously unbalanced load for the
> two CPUs.
>
> Fix by using util_avg_uclamp for util_fits_cpu(). This way, case 1) will
> still keep its utilization at 1024 whereas 2) shows spare capacities if
> the sum of util_avg_uclamp values is still under the CPU capacity.
> Note that this is roughly what the sum aggregation does in the Android
> kernel [2] (although we clamp UCLAMP_MIN as well in this patch, which
> may need some discussions), which shows superior energy savings because
> there's more chance that a task can get scheduled on 2) instead of
> finding a big CPU to run on.
>
> Under sum aggregation, checking whether a task fits a CPU becomes much
> simpler. We simply call fits_capacity(), and no code is needed to check
> all the uclamp corner cases. This means util_fits_cpu() goes back to
> returning true or false instead of a tri-state value, simplifying a
> significant amount of code.

You could remove util_fits_cpu() and task_fits_cpu() and call
fits_capacity() directly. We should try to keep the zoo of util-related
functions as small as possible.
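
To illustrate the point, here is a rough userspace sketch of what a direct
fits_capacity() check could look like under sum aggregation. It is not the
patch's code: struct task_sketch, clamp_util(), cpu_util_uclamp() and
task_fits_sketch() are hypothetical names made up for this example, and the
per-task clamping is a simplification of how util_avg_uclamp is tracked. Only
the 20% margin in fits_capacity() mirrors the macro in kernel/sched/fair.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same 20% headroom as the kernel's fits_capacity() macro. */
static inline bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* Hypothetical per-task view: raw utilization plus its uclamp bounds. */
struct task_sketch {
	unsigned long util_avg;
	unsigned long uclamp_min;
	unsigned long uclamp_max;
};

/* Clamp a utilization value into [lo, hi]. */
static inline unsigned long clamp_util(unsigned long util,
				       unsigned long lo, unsigned long hi)
{
	if (util < lo)
		return lo;
	if (util > hi)
		return hi;
	return util;
}

/* Sum aggregation (assumed form): add up each task's clamped utilization. */
static inline unsigned long cpu_util_uclamp(const struct task_sketch *tasks,
					    size_t nr)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < nr; i++)
		sum += clamp_util(tasks[i].util_avg, tasks[i].uclamp_min,
				  tasks[i].uclamp_max);
	return sum;
}

/* No tri-state and no uclamp corner cases: one plain fits_capacity() call. */
static inline bool task_fits_sketch(const struct task_sketch *p,
				    const struct task_sketch *queued,
				    size_t nr, unsigned long capacity)
{
	unsigned long util = cpu_util_uclamp(queued, nr) +
			     clamp_util(p->util_avg, p->uclamp_min,
					p->uclamp_max);

	return fits_capacity(util, capacity);
}
```

With a capacity of 1024, a CPU running one task at util_avg 1024 and
UCLAMP_MAX 300 sums to 300 and still accepts a task of util 100, whereas a CPU
running an unclamped task at 1024 does not. That is the distinction between
cases 1) and 2) in the commit message.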
[...]