Message-ID: <925bbda25461035fdec1bebf8487f84f9a3852a7.camel@linux.intel.com>
Date: Tue, 05 Sep 2023 12:30:51 -0700
From: Tim Chen <tim.c.chen@...ux.intel.com>
To: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>, mingo@...hat.com,
peterz@...radead.org, vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com, vschneid@...hat.com,
linux-kernel@...r.kernel.org, srikar@...ux.vnet.ibm.com,
mgorman@...hsingularity.net, mingo@...nel.org, yu.c.chen@...el.com,
ricardo.neri-calderon@...ux.intel.com, iamjoonsoo.kim@....com,
juri.lelli@...hat.com, rocking@...ux.alibaba.com,
joshdon@...gle.com
Subject: Re: [PATCH] sched/fair: optimize should_we_balance for higher SMT
systems
On Sat, 2023-09-02 at 13:42 +0530, Shrikanth Hegde wrote:
>
>
> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> Signed-off-by: Shrikanth Hegde <sshegde@...ux.vnet.ibm.com>
> ---
> kernel/sched/fair.c | 15 ++++++++++++++-
> 1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0b7445cd5af9..6e31923293bb 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6619,6 +6619,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> /* Working cpumask for: load_balance, load_balance_newidle. */
> static DEFINE_PER_CPU(cpumask_var_t, load_balance_mask);
> static DEFINE_PER_CPU(cpumask_var_t, select_rq_mask);
> +static DEFINE_PER_CPU(cpumask_var_t, should_we_balance_tmpmask);
>
> #ifdef CONFIG_NO_HZ_COMMON
>
> @@ -10913,6 +10914,7 @@ static int active_load_balance_cpu_stop(void *data);
>
> static int should_we_balance(struct lb_env *env)
> {
> + struct cpumask *swb_cpus = this_cpu_cpumask_var_ptr(should_we_balance_tmpmask);
> struct sched_group *sg = env->sd->groups;
> int cpu, idle_smt = -1;
>
> @@ -10936,8 +10938,9 @@ static int should_we_balance(struct lb_env *env)
> return 1;
> }
>
> + cpumask_copy(swb_cpus, group_balance_mask(sg));
> /* Try to find first idle CPU */
> - for_each_cpu_and(cpu, group_balance_mask(sg), env->cpus) {
> + for_each_cpu_and(cpu, swb_cpus, env->cpus) {
> if (!idle_cpu(cpu))
> continue;
>
> @@ -10949,6 +10952,14 @@ static int should_we_balance(struct lb_env *env)
> if (!(env->sd->flags & SD_SHARE_CPUCAPACITY) && !is_core_idle(cpu)) {
> if (idle_smt == -1)
> idle_smt = cpu;
> + /*
> + * If the core is not idle and an idle SMT sibling has already
> + * been found, there is no need to check the remaining SMT
> + * siblings for idleness.
> + */
> +#ifdef CONFIG_SCHED_SMT
> + cpumask_andnot(swb_cpus, swb_cpus, cpu_smt_mask(cpu));
> +#endif
> continue;
> }
>
> @@ -12914,6 +12925,8 @@ __init void init_sched_fair_class(void)
> for_each_possible_cpu(i) {
> zalloc_cpumask_var_node(&per_cpu(load_balance_mask, i), GFP_KERNEL, cpu_to_node(i));
> zalloc_cpumask_var_node(&per_cpu(select_rq_mask, i), GFP_KERNEL, cpu_to_node(i));
> + zalloc_cpumask_var_node(&per_cpu(should_we_balance_tmpmask, i),
> + GFP_KERNEL, cpu_to_node(i));
Shrikanth,

Wonder if we can avoid allocating should_we_balance_tmpmask for
the SMT2 case, to save memory on systems with a large number of
cores.  The new mask and logic are, I think, only needed when a
core has more than 2 threads.
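
Something like the below is what I have in mind (untested sketch;
whether cpu_smt_mask() already reflects the full topology for
not-yet-online CPUs at this point in boot is an assumption that
would need checking):

	for_each_possible_cpu(i) {
		zalloc_cpumask_var_node(&per_cpu(load_balance_mask, i), GFP_KERNEL, cpu_to_node(i));
		zalloc_cpumask_var_node(&per_cpu(select_rq_mask, i), GFP_KERNEL, cpu_to_node(i));
#ifdef CONFIG_SCHED_SMT
		/* Only cores with more than 2 SMT siblings need the temp mask. */
		if (cpumask_weight(cpu_smt_mask(i)) > 2)
			zalloc_cpumask_var_node(&per_cpu(should_we_balance_tmpmask, i),
						GFP_KERNEL, cpu_to_node(i));
#endif

should_we_balance() would then have to fall back to walking
group_balance_mask(sg) directly when the temp mask was not
allocated, so it is a trade-off between memory and a bit of
extra code.
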
Tim
>
> #ifdef CONFIG_CFS_BANDWIDTH
> INIT_CSD(&cpu_rq(i)->cfsb_csd, __cfsb_csd_unthrottle, cpu_rq(i));
> --
> 2.31.1
>