Message-ID: <3ccd31d4-ecb5-16fa-40c0-f7cc2fb9f9f2@bytedance.com>
Date:   Fri, 16 Dec 2022 15:36:35 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     chenying <chenying.kernel@...edance.com>, mingo@...hat.com,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Benjamin Segall <bsegall@...gle.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: Reduce rq lock contention in load_balance()

On 12/13/22 11:13 AM, chenying wrote:
> [nit]
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e4a0b8bd941c..aeb4fa9ac93a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10295,6 +10295,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                  goto out_balanced;
>          }
> 
> +refind:
>          busiest = find_busiest_queue(&env, group);
>          if (!busiest) {
>                  schedstat_inc(sd->lb_nobusyq[idle]);
> @@ -10303,6 +10304,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> 
>          WARN_ON_ONCE(busiest == env.dst_rq);
> 
> +       if (READ_ONCE(busiest->balancing)) {
> +               __cpumask_clear_cpu(cpu_of(busiest), cpus);
> +               if (cpumask_intersects(sched_group_span(group), cpus))
> +                       goto refind;
> +
> +               goto out_balanced;
> +       }
> +

Removing the cpu from @cpus here prevents it from being selected again
once a redo is triggered because all tasks on the busiest cpu are pinned
by cpu affinity. In that case the removed cpu may still be the busiest,
and may no longer be in balancing by then.

IMHO it'd be better to skip the in-balancing cpus in find_busiest_queue()
without modifying @cpus, to keep consistency across the redos.
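
To illustrate the suggestion, here is a minimal userspace sketch, not
kernel code. struct toy_rq, toy_find_busiest_queue(), the flat load value
and the plain bitmask are all hypothetical simplifications; only the
balancing flag mirrors the field introduced by the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq: only what the sketch needs. */
struct toy_rq {
	unsigned long load;	/* simplified load metric (assumption) */
	bool balancing;		/* another cpu is currently pulling from this rq */
};

/*
 * Pick the busiest rq among the cpus set in @cpus, skipping any rq that
 * is currently in balancing. @cpus is passed by value and never modified,
 * so a later redo still sees the same candidate set.
 *
 * @nr_cpus must not exceed the number of bits in unsigned long.
 * Returns the cpu index, or -1 if no candidate is eligible.
 */
static int toy_find_busiest_queue(const struct toy_rq *rqs, size_t nr_cpus,
				  unsigned long cpus)
{
	unsigned long busiest_load = 0;
	int busiest = -1;

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		if (!(cpus & (1UL << cpu)))
			continue;	/* not a candidate for this balance pass */
		if (rqs[cpu].balancing)
			continue;	/* skip for now, but leave @cpus untouched */
		if (busiest < 0 || rqs[cpu].load > busiest_load) {
			busiest_load = rqs[cpu].load;
			busiest = (int)cpu;
		}
	}

	return busiest;
}
```

Because the skip happens at selection time only, a cpu that was in
balancing during one pass becomes eligible again on the next redo as soon
as its flag is cleared. In the real code the flag would presumably be read
with READ_ONCE(), as the patch already does.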

Thanks & Best,
	Abel