Message-ID: <CAKfTPtCujzU0musRE=FxvgR4rky76XnqaF=ak6U5YwgWnZu9KQ@mail.gmail.com>
Date:   Tue, 30 May 2023 16:45:07 +0200
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     Yicong Yang <yangyicong@...wei.com>
Cc:     mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
        dietmar.eggemann@....com, vschneid@...hat.com,
        linux-kernel@...r.kernel.org, rostedt@...dmis.org,
        bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com,
        yu.c.chen@...el.com, linuxarm@...wei.com, prime.zeng@...wei.com,
        wangjie125@...wei.com, yangyicong@...ilicon.com
Subject: Re: [PATCH v2] sched/fair: Don't balance task to its current running CPU

On Tue, 30 May 2023 at 10:26, Yicong Yang <yangyicong@...wei.com> wrote:
>
> From: Yicong Yang <yangyicong@...ilicon.com>
>
> We've run into a case where the balancer tries to migrate a
> migration-disabled task and triggers the warning in set_task_cpu():
>
>  ------------[ cut here ]------------
>  WARNING: CPU: 7 PID: 0 at kernel/sched/core.c:3115 set_task_cpu+0x188/0x240
>  Modules linked in: hclgevf xt_CHECKSUM ipt_REJECT nf_reject_ipv4 <...snip>
>  CPU: 7 PID: 0 Comm: swapper/7 Kdump: loaded Tainted: G           O       6.1.0-rc4+ #1
>  Hardware name: Huawei TaiShan 2280 V2/BC82AMDC, BIOS 2280-V2 CS V5.B221.01 12/09/2021
>  pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>  pc : set_task_cpu+0x188/0x240
>  lr : load_balance+0x5d0/0xc60
>  sp : ffff80000803bc70
>  x29: ffff80000803bc70 x28: ffff004089e190e8 x27: ffff004089e19040
>  x26: ffff007effcabc38 x25: 0000000000000000 x24: 0000000000000001
>  x23: ffff80000803be84 x22: 000000000000000c x21: ffffb093e79e2a78
>  x20: 000000000000000c x19: ffff004089e19040 x18: 0000000000000000
>  x17: 0000000000001fad x16: 0000000000000030 x15: 0000000000000000
>  x14: 0000000000000003 x13: 0000000000000000 x12: 0000000000000000
>  x11: 0000000000000001 x10: 0000000000000400 x9 : ffffb093e4cee530
>  x8 : 00000000fffffffe x7 : 0000000000ce168a x6 : 000000000000013e
>  x5 : 00000000ffffffe1 x4 : 0000000000000001 x3 : 0000000000000b2a
>  x2 : 0000000000000b2a x1 : ffffb093e6d6c510 x0 : 0000000000000001
>  Call trace:
>   set_task_cpu+0x188/0x240
>   load_balance+0x5d0/0xc60
>   rebalance_domains+0x26c/0x380
>   _nohz_idle_balance.isra.0+0x1e0/0x370
>   run_rebalance_domains+0x6c/0x80
>   __do_softirq+0x128/0x3d8
>   ____do_softirq+0x18/0x24
>   call_on_irq_stack+0x2c/0x38
>   do_softirq_own_stack+0x24/0x3c
>   __irq_exit_rcu+0xcc/0xf4
>   irq_exit_rcu+0x18/0x24
>   el1_interrupt+0x4c/0xe4
>   el1h_64_irq_handler+0x18/0x2c
>   el1h_64_irq+0x74/0x78
>   arch_cpu_idle+0x18/0x4c
>   default_idle_call+0x58/0x194
>   do_idle+0x244/0x2b0
>   cpu_startup_entry+0x30/0x3c
>   secondary_start_kernel+0x14c/0x190
>   __secondary_switched+0xb0/0xb4
>  ---[ end trace 0000000000000000 ]---
>
> Further investigation shows that the warning is superfluous: the
> migration-disabled task is just going to be migrated to the CPU it is
> currently on. This happens because, during load balance, if dst_cpu is
> not allowed by the task, we re-select a new_dst_cpu as a candidate. If no
> task can be balanced to dst_cpu, we try to balance the task to
> new_dst_cpu instead. A migration-disabled task that is not on CPU is
> only allowed to run on its current CPU, so load balance selects that CPU
> as new_dst_cpu and later triggers the warning above.
>
> The new_dst_cpu is chosen from env->dst_grpmask. Currently it contains
> the CPUs in sched_group_span(), so with overlapped groups it's possible
> to run into this case. This patch sets env->dst_grpmask to
> group_balance_mask(), which excludes any CPUs from the busiest group,
> and solves the issue. For balancing in a domain with no overlapped
> groups, the behaviour stays the same as before.
>
> Suggested-by: Vincent Guittot <vincent.guittot@...aro.org>
> Signed-off-by: Yicong Yang <yangyicong@...ilicon.com>

Reviewed-by: Vincent Guittot <vincent.guittot@...aro.org>

> ---
> Changes since v1:
> - Solve the issue by setting env->dst_grpmask to group_balance_mask(),
>   per Vincent
> Link: https://lore.kernel.org/all/20230524072018.62204-1-yangyicong@huawei.com/
>
> - Thanks to Valentin for explaining migration disable. The previous
>   discussion can be found at
> https://lore.kernel.org/all/20230313065759.39698-1-yangyicong@huawei.com/
>
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 373ff5f55884..0128dc9344cc 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10744,7 +10744,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>                 .sd             = sd,
>                 .dst_cpu        = this_cpu,
>                 .dst_rq         = this_rq,
> -               .dst_grpmask    = sched_group_span(sd->groups),
> +               .dst_grpmask    = group_balance_mask(sd->groups),
>                 .idle           = idle,
>                 .loop_break     = SCHED_NR_MIGRATE_BREAK,
>                 .cpus           = cpus,
> --
> 2.24.0
>
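
For readers following the thread, the re-selection problem described in the
changelog can be illustrated with a small user-space model. This is only a
sketch under stated assumptions, not kernel code: CPU masks are modelled as
plain 64-bit bitmasks, the four-CPU overlapping topology is made up, and
pick_new_dst_cpu() is a hypothetical helper that mimics choosing the first
CPU in dst_grpmask that the task is allowed to run on.

```c
#include <stdint.h>

/* Toy model: one bit per CPU. All names here are illustrative only. */
#define CPU_BIT(cpu)  (UINT64_C(1) << (cpu))
#define NO_CPU        (-1)

/*
 * Assumed topology with overlapping groups:
 *   local group span    = CPUs 0, 1, 2
 *   busiest group span  = CPUs 2, 3
 *   local balance mask  = CPUs 0, 1  (CPUs of the busiest group excluded)
 */
#define LOCAL_GROUP_SPAN          (CPU_BIT(0) | CPU_BIT(1) | CPU_BIT(2))
#define LOCAL_GROUP_BALANCE_MASK  (CPU_BIT(0) | CPU_BIT(1))

/* A migration-disabled task sitting on CPU 2 may only run on CPU 2. */
#define TASK_CPU      2
#define TASK_ALLOWED  CPU_BIT(TASK_CPU)

/*
 * Return the first CPU present in both dst_grpmask and the task's allowed
 * mask, or NO_CPU if there is none. This models the new_dst_cpu
 * re-selection step in a simplified form.
 */
static int pick_new_dst_cpu(uint64_t dst_grpmask, uint64_t task_allowed)
{
	uint64_t candidates = dst_grpmask & task_allowed;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (candidates & CPU_BIT(cpu))
			return cpu;
	}
	return NO_CPU;
}
```

In this model, pick_new_dst_cpu(LOCAL_GROUP_SPAN, TASK_ALLOWED) returns the
task's own CPU, which corresponds to the bogus "migration" that trips the
warning, while pick_new_dst_cpu(LOCAL_GROUP_BALANCE_MASK, TASK_ALLOWED)
finds no candidate and returns NO_CPU.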
