Message-ID: <676b2b4e-2c89-4b80-85a6-29f9a39d1694@amd.com>
Date: Fri, 12 Sep 2025 09:26:58 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Aaron Lu <ziqianlu@...edance.com>, Peter Zijlstra <peterz@...radead.org>
CC: kernel test robot <lkp@...el.com>, Valentin Schneider
	<vschneid@...hat.com>, Ben Segall <bsegall@...gle.com>, Chengming Zhou
	<chengming.zhou@...ux.dev>, Josh Don <joshdon@...gle.com>, Ingo Molnar
	<mingo@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, Xi Wang
	<xii@...gle.com>, <llvm@...ts.linux.dev>, <oe-kbuild-all@...ts.linux.dev>,
	<linux-kernel@...r.kernel.org>, Juri Lelli <juri.lelli@...hat.com>, "Dietmar
 Eggemann" <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>,
	Mel Gorman <mgorman@...e.de>, Chuyi Zhou <zhouchuyi@...edance.com>, "Jan
 Kiszka" <jan.kiszka@...mens.com>, Florian Bezdeka
	<florian.bezdeka@...mens.com>, Songtang Liu <liusongtang@...edance.com>,
	"Chen Yu" <yu.c.chen@...el.com>, Matteo Martelli
	<matteo.martelli@...ethink.co.uk>, Michal Koutný
	<mkoutny@...e.com>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH update 4/4] sched/fair: Do not balance task to a throttled
 cfs_rq

Hello Aaron,

On 9/12/2025 9:14 AM, Aaron Lu wrote:
> When doing load balance and the target cfs_rq is in a throttled
> hierarchy, it is debatable whether balancing there should be allowed.
> 
> The argument for allowing it: if the target CPU is idle or less loaded
> and the task being balanced is holding some kernel resources, then
> migrating the task there lets it get the CPU earlier and release those
> resources sooner. The argument against: if the task is not holding any
> kernel resources, the migration is of little use.
> 
> While theoretically debatable, a performance test[0] involving 200
> cgroups, each running hackbench (20 senders, 20 receivers) in pipe
> mode, showed a performance regression on AMD Genoa when load balancing
> to throttled cfs_rqs was allowed. Analysis[1] showed that hackbench
> suffers from task migration across LLC boundaries. For this reason,
> add a check in can_migrate_task() to forbid balancing to a cfs_rq
> that is in a throttled hierarchy. This greatly reduced task migrations
> and restored performance.
> 
> [0]: https://lore.kernel.org/lkml/20250822110701.GB289@bytedance/
> [1]: https://lore.kernel.org/lkml/20250903101102.GB42@bytedance/
> Signed-off-by: Aaron Lu <ziqianlu@...edance.com>

Thank you for updating the patch. Feel free to include:

Reviewed-by: K Prateek Nayak <kprateek.nayak@....com>

-- 
Thanks and Regards,
Prateek

> ---
> update: fix build error reported by kernel test robot when
> CONFIG_FAIR_GROUP_SCHED is not set.
> 
>  kernel/sched/fair.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3dbdfaa697477..18a30ae35441a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5737,6 +5737,11 @@ static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  	return cfs_bandwidth_used() && cfs_rq->throttle_count;
>  }
>  
> +static inline int lb_throttled_hierarchy(struct task_struct *p, int dst_cpu)
> +{
> +	return throttled_hierarchy(task_group(p)->cfs_rq[dst_cpu]);
> +}
> +
>  static inline bool task_is_throttled(struct task_struct *p)
>  {
>  	return cfs_bandwidth_used() && p->throttled;
> @@ -6733,6 +6738,11 @@ static inline int throttled_hierarchy(struct cfs_rq *cfs_rq)
>  	return 0;
>  }
>  
> +static inline int lb_throttled_hierarchy(struct task_struct *p, int dst_cpu)
> +{
> +	return 0;
> +}
> +
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>  void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent) {}
>  static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq) {}
> @@ -9369,14 +9379,18 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  	/*
>  	 * We do not migrate tasks that are:
>  	 * 1) delayed dequeued unless we migrate load, or
> -	 * 2) cannot be migrated to this CPU due to cpus_ptr, or
> -	 * 3) running (obviously), or
> -	 * 4) are cache-hot on their current CPU, or
> -	 * 5) are blocked on mutexes (if SCHED_PROXY_EXEC is enabled)
> +	 * 2) target cfs_rq is in throttled hierarchy, or
> +	 * 3) cannot be migrated to this CPU due to cpus_ptr, or
> +	 * 4) running (obviously), or
> +	 * 5) are cache-hot on their current CPU, or
> +	 * 6) are blocked on mutexes (if SCHED_PROXY_EXEC is enabled)
>  	 */
>  	if ((p->se.sched_delayed) && (env->migration_type != migrate_load))
>  		return 0;
>  
> +	if (lb_throttled_hierarchy(p, env->dst_cpu))
> +		return 0;
> +
>  	/*
>  	 * We want to prioritize the migration of eligible tasks.
>  	 * For ineligible tasks we soft-limit them and only allow

