Message-ID: <736d41f0-1eb4-4420-ab67-e88fc7e31bda@linux.ibm.com>
Date: Fri, 4 Jul 2025 01:22:09 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>, Chen Yu <yu.c.chen@...el.com>
Cc: Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
Tim Chen <tim.c.chen@...el.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Libo Chen <libo.chen@...cle.com>, Abel Wu <wuyun.abel@...edance.com>,
Madadi Vineeth Reddy <vineethr@...ux.ibm.com>,
Hillf Danton <hdanton@...a.com>, Len Brown <len.brown@...el.com>,
linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
K Prateek Nayak <kprateek.nayak@....com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>
Subject: Re: [RFC patch v3 14/20] sched: Introduce update_llc_busiest() to
deal with groups having preferred LLC tasks
On 6/18/25 23:58, Tim Chen wrote:
> The load balancer attempts to identify the busiest sched_group with
> the highest load and migrates some tasks to a less busy sched_group
> to distribute the load across different CPUs.
>
> When cache-aware scheduling is enabled, the busiest sched_group is
> defined as the one with the highest number of tasks preferring to run
> on the destination LLC. If the busiest group has the llc_balance tag
> set, cache-aware load balancing is launched.
>
> Introduce the helper function update_llc_busiest() to identify
> the sched_group with the most tasks preferring the destination LLC.
>
> Co-developed-by: Chen Yu <yu.c.chen@...el.com>
> Signed-off-by: Chen Yu <yu.c.chen@...el.com>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> ---
> kernel/sched/fair.c | 36 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 48a090c6e885..ab3d1239d6e4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10848,12 +10848,36 @@ static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
>
> return false;
> }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> + struct sg_lb_stats *busiest,
> + struct sg_lb_stats *sgs)
> +{
> + int idx;
> +
> + /* Only the candidate with llc_balance need to be taken care of */
> + if (!sgs->group_llc_balance)
> + return false;
> +
> + /*
> + * There are more tasks that want to run on dst_cpu's LLC.
> + */
> + idx = llc_idx(env->dst_cpu);
> + return sgs->nr_pref_llc[idx] > busiest->nr_pref_llc[idx];
> +}
> #else
> static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
> struct sched_group *group)
> {
> return false;
> }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> + struct sg_lb_stats *busiest,
> + struct sg_lb_stats *sgs)
> +{
> + return false;
> +}
> #endif
>
> static inline long sibling_imbalance(struct lb_env *env,
> @@ -11085,6 +11109,14 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> sds->local_stat.group_type != group_has_spare))
> return false;
>
> + /* deal with prefer LLC load balance, if failed, fall into normal load balance */
> + if (update_llc_busiest(env, busiest, sgs))
> + return true;
> +
> + /* if there is already a busy group, skip the normal load balance */
> + if (busiest->group_llc_balance)
> + return false;
> +
A group that is group_overloaded could also have group_llc_balance set, right?
In that case the group_type-based priority is no longer followed, is it?
> if (sgs->group_type > busiest->group_type)
> return true;
>
> @@ -11991,9 +12023,11 @@ static struct sched_group *sched_balance_find_src_group(struct lb_env *env)
> /*
> * Try to move all excess tasks to a sibling domain of the busiest
> * group's child domain.
> + * Also do so if we can move some tasks that prefer the local LLC.
> */
> if (sds.prefer_sibling && local->group_type == group_has_spare &&
> - sibling_imbalance(env, &sds, busiest, local) > 1)
> + (busiest->group_llc_balance ||
> + sibling_imbalance(env, &sds, busiest, local) > 1))
> goto force_balance;
>
> if (busiest->group_type != group_overloaded) {
Also, load balancing triggered by the LLC preference could be very tricky to debug.
Have any stats been added to schedstat or sched/debug?