Message-ID: <736d41f0-1eb4-4420-ab67-e88fc7e31bda@linux.ibm.com>
Date: Fri, 4 Jul 2025 01:22:09 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>, Chen Yu <yu.c.chen@...el.com>
Cc: Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
Tim Chen <tim.c.chen@...el.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Libo Chen <libo.chen@...cle.com>, Abel Wu <wuyun.abel@...edance.com>,
Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
Hillf Danton <hdanton@...a.com>, Len Brown <len.brown@...el.com>,
linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [RFC patch v3 14/20] sched: Introduce update_llc_busiest() to
deal with groups having preferred LLC tasks
On 6/18/25 23:58, Tim Chen wrote:
> The load balancer attempts to identify the busiest sched_group with
> the highest load and migrates some tasks to a less busy sched_group
> to distribute the load across different CPUs.
>
> When cache-aware scheduling is enabled, the busiest sched_group is
> defined as the one with the highest number of tasks preferring to run
> on the destination LLC. If the busiest group has the llc_balance tag
> set, cache-aware load balancing is launched.
>
> Introduce the helper function update_llc_busiest() to identify
> the sched_group with the most tasks preferring the destination LLC.
>
> Co-developed-by: Chen Yu <yu.c.chen@...el.com>
> Signed-off-by: Chen Yu <yu.c.chen@...el.com>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
> kernel/sched/fair.c | 36 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 48a090c6e885..ab3d1239d6e4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10848,12 +10848,36 @@ static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
>
> return false;
> }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> + struct sg_lb_stats *busiest,
> + struct sg_lb_stats *sgs)
> +{
> + int idx;
> +
> + /* Only the candidate with llc_balance need to be taken care of */
> + if (!sgs->group_llc_balance)
> + return false;
> +
> + /*
> + * There are more tasks that want to run on dst_cpu's LLC.
> + */
> + idx = llc_idx(env->dst_cpu);
> + return sgs->nr_pref_llc[idx] > busiest->nr_pref_llc[idx];
> +}
> #else
> static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
> struct sched_group *group)
> {
> return false;
> }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> + struct sg_lb_stats *busiest,
> + struct sg_lb_stats *sgs)
> +{
> + return false;
> +}
> #endif
>
> static inline long sibling_imbalance(struct lb_env *env,
> @@ -11085,6 +11109,14 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> sds->local_stat.group_type != group_has_spare))
> return false;
>
> + /* deal with prefer LLC load balance, if failed, fall into normal load balance */
> + if (update_llc_busiest(env, busiest, sgs))
> + return true;
> +
> + /* if there is already a busy group, skip the normal load balance */
> + if (busiest->group_llc_balance)
> + return false;
> +
A group that is group_overloaded could also have group_llc_balance set, right?
In that case the group_type-based priority is no longer followed, is it?
> if (sgs->group_type > busiest->group_type)
> return true;
>
> @@ -11991,9 +12023,11 @@ static struct sched_group *sched_balance_find_src_group(struct lb_env *env)
> /*
> * Try to move all excess tasks to a sibling domain of the busiest
> * group's child domain.
> + * Also do so if we can move some tasks that prefer the local LLC.
> */
> if (sds.prefer_sibling && local->group_type == group_has_spare &&
> - sibling_imbalance(env, &sds, busiest, local) > 1)
> + (busiest->group_llc_balance ||
> + sibling_imbalance(env, &sds, busiest, local) > 1))
> goto force_balance;
>
> if (busiest->group_type != group_overloaded) {
Also, load balancing triggered by the LLC preference could be very tricky to debug.
Have any stats been added to schedstat or sched/debug?