Date:   Fri, 16 Dec 2022 15:36:35 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     chenying <chenying.kernel@...edance.com>, mingo@...hat.com,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Benjamin Segall <bsegall@...gle.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: Reduce rq lock contention in load_balance()

On 12/13/22 11:13 AM, chenying wrote:
> [nit]
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e4a0b8bd941c..aeb4fa9ac93a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10295,6 +10295,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                  goto out_balanced;
>          }
> 
> +refind:
>          busiest = find_busiest_queue(&env, group);
>          if (!busiest) {
>                  schedstat_inc(sd->lb_nobusyq[idle]);
> @@ -10303,6 +10304,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> 
>          WARN_ON_ONCE(busiest == env.dst_rq);
> 
> +       if (READ_ONCE(busiest->balancing)) {
> +               __cpumask_clear_cpu(cpu_of(busiest), cpus);
> +               if (cpumask_intersects(sched_group_span(group), cpus))
> +                       goto refind;
> +
> +               goto out_balanced;
> +       }
> +

Removing the cpu from @cpus here prevents it from being selected again
if a redo is triggered because all tasks on the busiest cpu are pinned
by cpu affinity. In that case the removed cpu may still be the busiest
one, and it may no longer be in balancing by the time of the redo.

IMHO it'd be better to skip the in-balancing cpus in find_busiest_queue()
without modifying @cpus, to keep the candidate set consistent across redos.
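
To illustrate the idea, here is a rough userspace model, not kernel code.
struct rq_model, find_busiest_model() and the plain bitmask standing in
for @cpus are all made up for this sketch; only the balancing flag comes
from the patch. The point is that the skip happens inside the search and
the mask is left untouched.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq; not the kernel structure. */
struct rq_model {
	unsigned long load;	/* stand-in for the runqueue's load */
	int balancing;		/* flag introduced by the patch under review */
};

/*
 * Return the index of the most loaded candidate in @cpus, or -1 if none.
 * Runqueues that are currently being balanced are skipped for this pass
 * only. @cpus is passed by value and never modified, so a later redo
 * starts from the same candidate set.
 */
int find_busiest_model(const struct rq_model *rqs, size_t nr,
		       unsigned long cpus)
{
	int busiest = -1;
	unsigned long busiest_load = 0;

	/* One bit per cpu; the model cannot index past the mask width. */
	assert(nr <= sizeof(cpus) * CHAR_BIT);

	for (size_t i = 0; i < nr; i++) {
		if (!(cpus & (1UL << i)))
			continue;	/* not a candidate at all */
		if (rqs[i].balancing)
			continue;	/* skipped now, eligible again on a redo */
		if (rqs[i].load > busiest_load) {
			busiest_load = rqs[i].load;
			busiest = (int)i;
		}
	}
	return busiest;
}
```

With three cpus of load 10, 50 and 30 where the second is in balancing,
the model picks the third. Once the second is done balancing, a redo with
the same mask picks it again, which is what clearing its bit would prevent.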

Thanks & Best,
	Abel
