Message-ID: <b781baf0-7c7e-41c3-a2b6-a2d620df3210@arm.com>
Date: Fri, 13 Sep 2024 15:21:31 +0200
From: Pierre Gondois <pierre.gondois@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, lukasz.luba@....com, rafael.j.wysocki@...el.com,
linux-kernel@...r.kernel.org
Cc: qyousef@...alina.io, hongyan.xia2@....com
Subject: Re: [PATCH 1/5] sched/fair: Filter false overloaded_group case for
EAS
Hello Vincent,
I have been trying this patch with the following workload, on a Pixel6
(4 littles, 2 mid, 2 big):
a. 5 tasks with: [UCLAMP_MIN:0, UCLAMP_MAX:1, duty_cycle=100%, cpuset:0-2]
b. 1 task with: [duty_cycle=100%, cpuset:0-7] but starting on CPU4
The intent is:
a. There are many UCLAMP_MAX tasks in order to pass the following condition
   and tag the little group as overloaded (5 always-running tasks pinned to
   CPUs 0-2, so sum_nr_running=5 exceeds the little group's weight of 4):
   group_is_overloaded()
   \-(sgs->sum_nr_running <= sgs->group_weight)
   These tasks should put the little cluster in an overloaded (but not
   overutilized) state. A sketch of how such a task can be created follows
   below.
b. to see if a CPU-bound task is migrated to the big cluster.
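For reference, here is a minimal sketch of how one of the UCLAMP_MAX tasks
of item a. could be created: pin the thread with sched_setaffinity() and set
the clamps through the sched_setattr() syscall (no glibc wrapper, hence the
raw syscall). The struct layout and SCHED_FLAG_UTIL_CLAMP_* values mirror
include/uapi/linux/sched.h; the actual tool used for the test may well
differ (e.g. rt-app), and error handling is elided:

#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Trimmed copy of struct sched_attr from include/uapi/linux/sched/types.h */
struct sched_attr {
        uint32_t size;
        uint32_t sched_policy;
        uint64_t sched_flags;
        int32_t  sched_nice;
        uint32_t sched_priority;
        uint64_t sched_runtime;
        uint64_t sched_deadline;
        uint64_t sched_period;
        uint32_t sched_util_min;
        uint32_t sched_util_max;
};

#define SCHED_FLAG_KEEP_POLICY          0x08
#define SCHED_FLAG_KEEP_PARAMS          0x10
#define SCHED_FLAG_UTIL_CLAMP_MIN       0x20
#define SCHED_FLAG_UTIL_CLAMP_MAX       0x40

int main(void)
{
        struct sched_attr attr = {
                .size           = sizeof(attr),
                /* Keep the CFS policy/params, only update the clamps */
                .sched_flags    = SCHED_FLAG_KEEP_POLICY |
                                  SCHED_FLAG_KEEP_PARAMS |
                                  SCHED_FLAG_UTIL_CLAMP_MIN |
                                  SCHED_FLAG_UTIL_CLAMP_MAX,
                .sched_util_min = 0,    /* UCLAMP_MIN:0 */
                .sched_util_max = 1,    /* UCLAMP_MAX:1 */
        };
        cpu_set_t set;

        CPU_ZERO(&set);                 /* cpuset:0-2 */
        CPU_SET(0, &set);
        CPU_SET(1, &set);
        CPU_SET(2, &set);
        sched_setaffinity(0, sizeof(set), &set);
        syscall(SYS_sched_setattr, 0, &attr, 0);

        for (;;)                        /* duty_cycle=100% */
                ;
}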
---
- Without patch 5 [RFC PATCH 5/5] sched/fair: Add push task callback for EAS
- Without this patch
The migration effectively doesn't happen: the load_balancer selects the
little cluster over the mid cluster as the busiest group, since the little
cluster is (wrongly) tagged overloaded without putting the system in an
overutilized state.
---
- Without patch 5 [RFC PATCH 5/5] sched/fair: Add push task callback for EAS
- With this patch
The load_balancer effectively selects the mid cluster over the little
cluster (since none of the little CPUs is overutilized), and migrates
task b. to a big CPU.
Note:
This is true most of the time, but whenever a non-UCLAMP_MAX task wakes up
on one of CPU0-3 (where the UCLAMP_MAX tasks are pinned), the cluster becomes
overutilized and the new mechanism is bypassed.
The same happens if a task with [UCLAMP_MIN:0, UCLAMP_MAX:1024,
duty_cycle=100%, cpuset:0] is added to the workload.
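For context on why the UCLAMP_MAX tasks alone never mark a little CPU as
overutilized: cpu_overutilized() takes the rq's aggregated uclamp values
into account. A condensed, from-memory view of the mainline code around
that time (exact details vary across kernel versions):

static inline int cpu_overutilized(int cpu)
{
        unsigned long rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN);
        unsigned long rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX);

        /* Return true only if the utilization doesn't fit CPU's capacity */
        return !util_fits_cpu(cpu_util_cfs(cpu), rq_util_min, rq_util_max, cpu);
}

With only UCLAMP_MAX:1 tasks enqueued, the rq's uclamp_max aggregates to 1
and util_fits_cpu() treats the CPU as fitting, so it is not overutilized.
As soon as an unclamped task is enqueued, rq_util_max jumps to the default
1024 and the fully-busy CPU no longer fits, which is the bypass described
above.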
---
- With patch 5 [RFC PATCH 5/5] sched/fair: Add push task callback for EAS
- Without this patch
The task b. gets an opportunity to migrate to a big CPU through the sched_tick.
However, with both patches applied, the migration is triggered by the
load_balancer.
---
So FWIW, from a mechanism PoV and independently of patch 5:
Tested-by: Pierre Gondois <pierre.gondois@....com>
On 8/30/24 15:03, Vincent Guittot wrote:
> With EAS, a group should be set overloaded if at least 1 CPU in the group
> is overutilized, but it can happen that a CPU is fully utilized by tasks
> because of the clamping of the compute capacity of the CPU. In such a case,
> the CPU is not overutilized and, as a result, should not be set overloaded
> either.
>
> group_overloaded having a higher priority than group_misfit, such a group
> can be selected as the busiest group instead of a group with a misfit task,
> which prevents load_balance from selecting the CPU with the misfit task to
> pull the latter onto a fitting CPU.
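(For context: the group_type classification in kernel/sched/fair.c, in
increasing priority order when picking the busiest group; condensed from
memory of the ~v6.11 source, comments paraphrased:

enum group_type {
        group_has_spare = 0,    /* Spare capacity for more tasks */
        group_fully_busy,       /* Fully used, but no contention */
        group_misfit_task,      /* A task doesn't fit the CPU's capacity */
        group_smt_balance,      /* Busy SMT group worth spreading */
        group_asym_packing,     /* SD_ASYM_PACKING preferred CPU is free */
        group_imbalanced,       /* Affinity previously hurt balancing */
        group_overloaded        /* Can't provide expected CPU cycles */
};

group_overloaded ranking above group_misfit_task is what lets the falsely
overloaded little group mask the mid group's misfit task.)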
>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
> kernel/sched/fair.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index fea057b311f6..e67d6029b269 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9806,6 +9806,7 @@ struct sg_lb_stats {
> enum group_type group_type;
> unsigned int group_asym_packing; /* Tasks should be moved to preferred CPU */
> unsigned int group_smt_balance; /* Task on busy SMT be moved */
> + unsigned long group_overutilized; /* At least 1 CPU in the group is overutilized */
> unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */
> #ifdef CONFIG_NUMA_BALANCING
> unsigned int nr_numa_running;
> @@ -10039,6 +10040,13 @@ group_has_capacity(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
> static inline bool
> group_is_overloaded(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
> {
> + /*
> + * With EAS and uclamp, 1 CPU in the group must be overutilized to
> + * consider the group overloaded.
> + */
> + if (sched_energy_enabled() && !sgs->group_overutilized)
> + return false;
> +
> if (sgs->sum_nr_running <= sgs->group_weight)
> return false;
>
> @@ -10252,8 +10260,10 @@ static inline void update_sg_lb_stats(struct lb_env *env,
> if (nr_running > 1)
> *sg_overloaded = 1;
>
> - if (cpu_overutilized(i))
> + if (cpu_overutilized(i)) {
> *sg_overutilized = 1;
> + sgs->group_overutilized = 1;
> + }
>
> #ifdef CONFIG_NUMA_BALANCING
> sgs->nr_numa_running += rq->nr_numa_running;