Message-ID: <676b2b4e-2c89-4b80-85a6-29f9a39d1694@amd.com>
Date: Fri, 12 Sep 2025 09:26:58 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Aaron Lu <ziqianlu@...edance.com>, Peter Zijlstra <peterz@...radead.org>
CC: kernel test robot <lkp@...el.com>, Valentin Schneider
<vschneid@...hat.com>, Ben Segall <bsegall@...gle.com>, Chengming Zhou
<chengming.zhou@...ux.dev>, Josh Don <joshdon@...gle.com>, Ingo Molnar
<mingo@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Xi Wang
<xii@...gle.com>, <llvm@...ts.linux.dev>, <oe-kbuild-all@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>, Juri Lelli <juri.lelli@...hat.com>, "Dietmar
Eggemann" <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>,
Mel Gorman <mgorman@...e.de>, Chuyi Zhou <zhouchuyi@...edance.com>, "Jan
Kiszka" <jan.kiszka@...mens.com>, Florian Bezdeka
<florian.bezdeka@...mens.com>, Songtang Liu <liusongtang@...edance.com>,
"Chen Yu" <yu.c.chen@...el.com>, Matteo Martelli
<matteo.martelli@...ethink.co.uk>, Michal Koutný
<mkoutny@...e.com>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH update 4/4] sched/fair: Do not balance task to a throttled
cfs_rq
Hello Aaron,
On 9/12/2025 9:14 AM, Aaron Lu wrote:
> When doing load balance and the target cfs_rq is in a throttled hierarchy,
> whether to allow balancing there is an open question.
>
> The upside of allowing it: if the target CPU is idle or less loaded and
> the task being balanced is holding some kernel resources, then it seems
> a good idea to balance the task there so that it gets the CPU earlier
> and releases those resources sooner. The downside: if the task is not
> holding any kernel resources, the balance does not seem that useful.
>
> While this is debatable in theory, a performance test[0] involving
> 200 cgroups, each running hackbench (20 senders, 20 receivers) in
> pipe mode, showed a performance degradation on AMD Genoa when allowing
> load balance to a throttled cfs_rq. Analysis[1] showed hackbench doesn't
> like task migration across LLC boundaries. For this reason, add a check in
> can_migrate_task() to forbid balancing to a cfs_rq that is in a throttled
> hierarchy. This reduced task migrations considerably and restored performance.
>
> [0]: https://lore.kernel.org/lkml/20250822110701.GB289@bytedance/
> [1]: https://lore.kernel.org/lkml/20250903101102.GB42@bytedance/
> Signed-off-by: Aaron Lu <ziqianlu@...edance.com>
Thank you for updating the patch. Feel free to include:
Reviewed-by: K Prateek Nayak <kprateek.nayak@....com>
--
Thanks and Regards,
Prateek
> ---
> update: fix build error reported by kernel test robot when
> CONFIG_FAIR_GROUP_SCHED is not set.
>
> kernel/sched/fair.c | 22 ++++++++++++++++++----
> 1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3dbdfaa697477..18a30ae35441a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5737,6 +5737,11 @@ static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  	return cfs_bandwidth_used() && cfs_rq->throttle_count;
>  }
>
> +static inline int lb_throttled_hierarchy(struct task_struct *p, int dst_cpu)
> +{
> +	return throttled_hierarchy(task_group(p)->cfs_rq[dst_cpu]);
> +}
> +
>  static inline bool task_is_throttled(struct task_struct *p)
>  {
>  	return cfs_bandwidth_used() && p->throttled;
> @@ -6733,6 +6738,11 @@ static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  	return 0;
>  }
>
> +static inline int lb_throttled_hierarchy(struct task_struct *p, int dst_cpu)
> +{
> +	return 0;
> +}
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent) {}
>  static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq) {}
> @@ -9369,14 +9379,18 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  	/*
>  	 * We do not migrate tasks that are:
>  	 * 1) delayed dequeued unless we migrate load, or
> -	 * 2) cannot be migrated to this CPU due to cpus_ptr, or
> -	 * 3) running (obviously), or
> -	 * 4) are cache-hot on their current CPU, or
> -	 * 5) are blocked on mutexes (if SCHED_PROXY_EXEC is enabled)
> +	 * 2) target cfs_rq is in throttled hierarchy, or
> +	 * 3) cannot be migrated to this CPU due to cpus_ptr, or
> +	 * 4) running (obviously), or
> +	 * 5) are cache-hot on their current CPU, or
> +	 * 6) are blocked on mutexes (if SCHED_PROXY_EXEC is enabled)
>  	 */
>  	if ((p->se.sched_delayed) && (env->migration_type != migrate_load))
>  		return 0;
>
> +	if (lb_throttled_hierarchy(p, env->dst_cpu))
> +		return 0;
> +
>  	/*
>  	 * We want to prioritize the migration of eligible tasks.
>  	 * For ineligible tasks we soft-limit them and only allow
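
For readers who want to see the new early-out in isolation, here is a
minimal userspace sketch of the check the quoted hunks add. It is not
kernel code: the structs are simplified stand-ins, NR_CPUS, struct task,
bandwidth_used, can_migrate_task_model() and run_example() are
hypothetical names made up for illustration, and every other check in
can_migrate_task() is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2	/* hypothetical: tiny two-CPU model */

/* Simplified stand-ins for the kernel structures, not the real layouts. */
struct cfs_rq { int throttle_count; };
struct task_group { struct cfs_rq *cfs_rq[NR_CPUS]; };
struct task { struct task_group *tg; };

/* Stand-in for cfs_bandwidth_used(); assumed to be a plain flag here. */
static bool bandwidth_used = true;

/* Mirrors the quoted helper: non-zero throttle_count means throttled. */
static int throttled_hierarchy(const struct cfs_rq *cfs_rq)
{
	return bandwidth_used && cfs_rq->throttle_count;
}

/* Look at the task's group cfs_rq on the destination CPU, as the patch does. */
static int lb_throttled_hierarchy(const struct task *p, int dst_cpu)
{
	return throttled_hierarchy(p->tg->cfs_rq[dst_cpu]);
}

/* Models only the new early-out; returns 1 if migration is allowed. */
static int can_migrate_task_model(const struct task *p, int dst_cpu)
{
	if (lb_throttled_hierarchy(p, dst_cpu))
		return 0;
	return 1;
}

/* Example: the group is throttled on CPU 1 but not on CPU 0. */
static void run_example(void)
{
	struct cfs_rq rq0 = { .throttle_count = 0 };
	struct cfs_rq rq1 = { .throttle_count = 1 };
	struct task_group tg = { .cfs_rq = { &rq0, &rq1 } };
	struct task p = { .tg = &tg };

	assert(can_migrate_task_model(&p, 0) == 1);
	assert(can_migrate_task_model(&p, 1) == 0);
}
```

The point of the sketch is that the decision depends only on the
destination CPU's cfs_rq for the task's group, so a task is refused
when its group is throttled on the CPU it would move to, regardless of
the state on its current CPU.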