Message-ID: <20211211111215.GW16608@worktop.programming.kicks-ass.net>
Date: Sat, 11 Dec 2021 12:12:15 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Yihao Wu <wuyihao@...ux.alibaba.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Shanpei Chen <shanpeic@...ux.alibaba.com>,
王贇 <yun.wang@...ux.alibaba.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Again ignore percpu threads for imbalance pulls

On Sat, Dec 11, 2021 at 05:48:08PM +0800, Yihao Wu wrote:
> Commit 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance
> pulls") was meant to fix a performance issue that occurs when load balance
> tries to migrate pinned kernel threads at the MC domain level. Such a
> migration is destined to fail, and once it fails it also disrupts wakeup
> balance at the NUMA domain level. The most severe case I noticed, which
> occurs frequently:
> |sum_nr_running(node1) - sum_nr_running(node2)| > 100
>
> However, the original fix is incomplete because it covers only case 1) below:
> 1) Created by create_kthread
> 2) Created by kernel_thread
> In case 2), no struct kthread is assigned to the task_struct (please refer
> to the comments in free_kthread_struct), so the check simply won't work.
>
> The easiest way to cover both cases is to check nr_cpus_allowed, as was
> discussed on the mailing list for v1 of the original fix.
>
> * lmbench3.lat_proc -P 104 fork (2 NUMA, and 26 cores, 2 threads)
>
>                      w/out patch    w/ patch
> fork+exit latency    1660 ms        1520 ms ( 8.4%)
>
> Fixes: 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance pulls")
> Signed-off-by: Yihao Wu <wuyihao@...ux.alibaba.com>
> ---
> kernel/kthread.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 4a4d7092a2d8..cb05d3ff2de4 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -543,11 +543,7 @@ void kthread_set_per_cpu(struct task_struct *k, int cpu)
>
>  bool kthread_is_per_cpu(struct task_struct *p)
>  {
> -	struct kthread *kthread = __to_kthread(p);
> -	if (!kthread)
> -		return false;
> -
> -	return test_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
> +	return (p->flags & PF_KTHREAD) && p->nr_cpus_allowed == 1;
>  }

NAK, this will break lots of things.
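
For illustration only, here is a minimal userspace sketch (not kernel code) of how the two checks in the quoted diff can classify the same task differently. Everything in it is a hypothetical stand-in: struct model_task, the MODEL_* constants, and the has_kthread field only model task_struct, PF_KTHREAD, KTHREAD_IS_PER_CPU, and a non-NULL struct kthread, and the constant values are made up. It is not a statement of the reasoning behind the reply above; it only shows that the affinity-based check matches a wider set of tasks than the flag-based one.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel definitions; the values are illustrative. */
#define MODEL_PF_KTHREAD          0x1u
#define MODEL_KTHREAD_IS_PER_CPU  0x1u

/* Models only the task_struct/kthread fields that the two checks read. */
struct model_task {
	unsigned int flags;          /* models task_struct::flags */
	int nr_cpus_allowed;         /* models task_struct::nr_cpus_allowed */
	bool has_kthread;            /* models a non-NULL struct kthread */
	unsigned int kthread_flags;  /* models kthread::flags */
};

/* Models the check removed by the quoted diff: explicit per-CPU marking. */
static bool is_per_cpu_by_flag(const struct model_task *p)
{
	if (!p->has_kthread)
		return false;
	return (p->kthread_flags & MODEL_KTHREAD_IS_PER_CPU) != 0;
}

/* Models the check added by the quoted diff: any kthread with one allowed CPU. */
static bool is_per_cpu_by_affinity(const struct model_task *p)
{
	return (p->flags & MODEL_PF_KTHREAD) != 0 && p->nr_cpus_allowed == 1;
}

/* True when the two checks give different answers for the same task. */
static bool predicates_disagree(const struct model_task *p)
{
	return is_per_cpu_by_flag(p) != is_per_cpu_by_affinity(p);
}
```

In this model, a kthread that is explicitly marked per-CPU and has one allowed CPU satisfies both checks. A kthread that merely has its affinity restricted to one CPU, without the per-CPU mark, satisfies only the affinity-based check, as does a PF_KTHREAD task with no struct kthread (case 2 in the quoted commit message). Every existing caller of kthread_is_per_cpu() would therefore see a different answer for such tasks.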