Message-ID: <87k0g48kyp.mognet@arm.com>
Date: Thu, 16 Dec 2021 18:26:38 +0000
From: Valentin Schneider <valentin.schneider@....com>
To: Yihao Wu <wuyihao@...ux.alibaba.com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Shanpei Chen <shanpeic@...ux.alibaba.com>,
王贇 <yun.wang@...ux.alibaba.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched/fair: Again ignore percpu threads for imbalance pulls
On 11/12/21 17:48, Yihao Wu wrote:
> commit 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance
> pulls") was meant to fix a performance issue where load balance tries to
> migrate pinned kernel threads at the MC domain level, which is destined
> to fail.
> After it fails, it further messes up wakeup balance at the NUMA domain
> level. The most severe case that I noticed, and which occurs frequently:
> |sum_nr_running(node1) - sum_nr_running(node2)| > 100
>
Wakeup balance (aka find_idlest_cpu()) is different from periodic load
balance (aka load_balance()) and doesn't use can_migrate_task(), so the
incriminated commit shouldn't have impacted it (at least not in obvious
ways...). Do you have any more details on that issue?
> However, the original bugfix is incomplete, because it covers only case 1)
> below:
> 1) Created by create_kthread
> 2) Created by kernel_thread
> No struct kthread is assigned to the task_struct in case 2 (please refer
> to the comments in free_kthread_struct), so the check simply won't work.
>
> The easiest way to cover both cases is to check nr_cpus_allowed, just as
> discussed on the mailing list for v1 of the original fix.
>
> * lmbench3.lat_proc -P 104 fork (2 NUMA, and 26 cores, 2 threads)
>
Reasoning about "proper" pcpu kthreads was simpler since they are static,
see 3a7956e25e1d ("kthread: Fix PF_KTHREAD vs to_kthread() race")
> w/out patch w/ patch
> fork+exit latency 1660 ms 1520 ms ( 8.4%)
>
> Fixes: 2f5f4cce496e ("sched/fair: Ignore percpu threads for imbalance pulls")
> Signed-off-by: Yihao Wu <wuyihao@...ux.alibaba.com>
> ---
> kernel/kthread.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 4a4d7092a2d8..cb05d3ff2de4 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -543,11 +543,7 @@ void kthread_set_per_cpu(struct task_struct *k, int cpu)
>
> bool kthread_is_per_cpu(struct task_struct *p)
> {
> - struct kthread *kthread = __to_kthread(p);
> - if (!kthread)
> - return false;
> -
> - return test_bit(KTHREAD_IS_PER_CPU, &kthread->flags);
> + return (p->flags & PF_KTHREAD) && p->nr_cpus_allowed == 1;
> }
As Peter said, this is going to cause issues. If you look at
kthread_set_per_cpu(), we also store a CPU value which we expect to be
valid when kthread_is_per_cpu() is true, and that change breaks this
invariant. AIUI, what you want to patch is the actual usage in
can_migrate_task().
>
> /**
> --
> 2.32.0.604.gb1f3e1269