[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18aa730a-01c5-48d8-9f08-44f4dfca4808@arm.com>
Date: Mon, 1 Dec 2025 13:31:43 +0000
From: Christian Loehle <christian.loehle@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, linux-kernel@...r.kernel.org, pierre.gondois@....com,
kprateek.nayak@....com
Cc: qyousef@...alina.io, hongyan.xia2@....com, luis.machado@....com
Subject: Re: [PATCH 0/6 v7] sched/fair: Add push task mecansim and hadle more
EAS cases
On 12/1/25 09:13, Vincent Guittot wrote:
> This is a subset of [1] (sched/fair: Rework EAS to handle more cases)
>
> [1] https://lore.kernel.org/all/20250314163614.1356125-1-vincent.guittot@linaro.org/
>
> The current Energy Aware Scheduler has some known limitations which have
> became more and more visible with features like uclamp as an example. This
> serie tries to fix some of those issues:
> - tasks stacked on the same CPU of a PD
> - tasks stuck on the wrong CPU.
>
> Patch 1 fixes the case where a CPU is wrongly classified as overloaded
> whereas it is capped to a lower compute capacity. This wrong classification
> can prevent periodic load balancer to select a group_misfit_task CPU
> because group_overloaded has higher priority.
>
> Patch 2 removes the need of testing uclamp_min in cpu_overutilized to
> trigger the active migration of a task on another CPU.
>
> Patch 3 prepares select_task_rq_fair() to be called without TTWU, Fork or
> Exec flags when we just want to look for a possible better CPU.
>
> Patch 4 adds push call back mecanism to fair scheduler but doesn't enable
> it.
>
> Patch 5 enable has_idle_core for !SMP system to track if there may be an
> idle CPU in the LLC.
>
> Patch 6 adds some conditions to enable pushing runnable tasks for EAS:
> - when a task is stuck on a CPU and the system is not overutilized.
> - if there is a possible idle CPU when the system is overutilized.
>
> More tests results will come later as I wanted to send the pachtset before
> LPC.
>
> Tbench on dragonboard rb5
> schedutil and EAS enabled
>
> # process tip +patchset
> 1 29.1(+/-4.1%) 124.7(+/-12.3%) +329%
> 2 60.0(+/-0.9%) 216.1(+/- 7.9%) +260%
> 4 255.8(+/-1.9%) 421.4(+/- 2.0%) +65%
> 8 1317.3(+/-4.6%) 1396.1(+/- 3.0%) +6%
> 16 958.2(+/-4.6%) 979.6(+/- 2.0%) +2%
Just so I understand, there's no uclamp in the workload here?
Could you expand on the workload a little, what were the parameters/settings?
So the significant increase is really only for nr_proc < nr_cpus, with the
observed throughput increase it'll probably be something like "always running
on little CPUs" vs "always running on big CPUs", is that what's happening?
Also shouldn't tbench still have plenty of wakeup events? It issues plenty of
TCP anyway.
>
> Hackbench didn't show any difference
>
>
> Vincent Guittot (6):
> sched/fair: Filter false overloaded_group case for EAS
> sched/fair: Update overutilized detection
> sched/fair: Prepare select_task_rq_fair() to be called for new cases
> sched/fair: Add push task mechanism for fair
> sched/fair: Enable idle core tracking for !SMT
> sched/fair: Add EAS and idle cpu push trigger
>
> kernel/sched/fair.c | 350 +++++++++++++++++++++++++++++++++++-----
> kernel/sched/sched.h | 46 ++++--
> kernel/sched/topology.c | 3 +
> 3 files changed, 346 insertions(+), 53 deletions(-)
>
Powered by blists - more mailing lists