linux-kernel - Re: [PATCH] sched/fair: Again ignore percpu threads for imbalance pulls

Message-ID: <20211211111215.GW16608@worktop.programming.kicks-ass.net>
Date:   Sat, 11 Dec 2021 12:12:15 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Yihao Wu <wuyihao@...ux.alibaba.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Shanpei Chen <shanpeic@...ux.alibaba.com>,
        王贇 <yun.wang@...ux.alibaba.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Again ignore percpu threads for imbalance
 pulls

On Sat, Dec 11, 2021 at 05:48:08PM +0800, Yihao Wu wrote:
> commit 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance
> pulls") was meant to fix a performance issue that occurs when load balance
> tries to migrate pinned kernel threads at the MC domain level. Such a
> migration is bound to fail, and the failure then throws off wakeup
> balancing at the NUMA domain level. The most severe case I noticed, which
> occurs frequently:
>     |sum_nr_running(node1) - sum_nr_running(node2)| > 100
> 
> However, the original fix is incomplete because it covers only case 1) below.
>   1) Created by create_kthread
>   2) Created by kernel_thread
> In case 2, no struct kthread is attached to the task_struct (see the
> comments in free_kthread_struct), so the check simply does not work.
> 
> The easiest way to cover both cases is to check nr_cpus_allowed, as was
> discussed on the mailing list for v1 of the original fix.
> 
> * lmbench3.lat_proc -P 104 fork (2 NUMA nodes, 26 cores, 2 threads)
> 
>                          w/out patch                 w/ patch
> fork+exit latency            1660 ms                  1520 ms (   8.4%)
> 
> Fixes: 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance pulls")
> Signed-off-by: Yihao Wu <wuyihao@...ux.alibaba.com>
> ---
>  kernel/kthread.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 4a4d7092a2d8..cb05d3ff2de4 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -543,11 +543,7 @@ void kthread_set_per_cpu(struct task_struct *k, int cpu)
>  
>  bool kthread_is_per_cpu(struct task_struct *p)
>  {
> -	struct kthread *kthread = __to_kthread(p);
> -	if (!kthread)
> -		return false;
> -
> -	return test_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
> +	return (p->flags & PF_KTHREAD) && p->nr_cpus_allowed == 1;
>  }

NAK, this will break lots of things.
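
The reply does not say what would break. One difference between the two versions of kthread_is_per_cpu() that can be shown in isolation is that the patched check also matches any kernel thread whose affinity happens to be a single CPU, whether or not it was marked with kthread_set_per_cpu(). The following is a minimal user-space sketch of that difference, not kernel code: the struct layouts, the kthread pointer field, the helper names is_per_cpu_by_flag(), is_per_cpu_by_affinity() and check_model(), and the constant values are simplified assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model values only; stand-ins for the kernel definitions. */
#define PF_KTHREAD          0x00200000u
#define KTHREAD_IS_PER_CPU  0   /* bit index in kthread.flags */

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct kthread {
	unsigned long flags;
};

struct task_struct {
	unsigned int flags;
	int nr_cpus_allowed;
	struct kthread *kthread;   /* assumption: NULL if none is attached */
};

/* Models the pre-patch logic: only explicitly marked threads qualify. */
static bool is_per_cpu_by_flag(const struct task_struct *p)
{
	const struct kthread *k = p->kthread;

	if (!k)
		return false;
	return (k->flags & (1UL << KTHREAD_IS_PER_CPU)) != 0;
}

/* Models the patched logic: any kernel thread limited to one CPU qualifies. */
static bool is_per_cpu_by_affinity(const struct task_struct *p)
{
	return (p->flags & PF_KTHREAD) && p->nr_cpus_allowed == 1;
}

/* Compares both predicates on three modelled kernel threads. */
static void check_model(void)
{
	struct kthread marked = { .flags = 1UL << KTHREAD_IS_PER_CPU };
	struct kthread unmarked = { .flags = 0 };

	/* Marked per-CPU thread: both predicates agree. */
	struct task_struct percpu = { PF_KTHREAD, 1, &marked };
	/* Affinity narrowed to one CPU, never marked per-CPU: they disagree. */
	struct task_struct pinned = { PF_KTHREAD, 1, &unmarked };
	/* Case 2 from the patch description: no struct kthread attached. */
	struct task_struct bare = { PF_KTHREAD, 1, NULL };

	assert(is_per_cpu_by_flag(&percpu) && is_per_cpu_by_affinity(&percpu));
	assert(!is_per_cpu_by_flag(&pinned) && is_per_cpu_by_affinity(&pinned));
	assert(!is_per_cpu_by_flag(&bare) && is_per_cpu_by_affinity(&bare));
}
```

In this model the "bare" task is the case the patch wants to cover, while the "pinned" task shows the wider reach of the affinity-based check. Whether that wider reach is what the NAK refers to is not stated in the message.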