Date: Mon, 12 Feb 2024 12:28:55 -0800
From: Josh Don <joshdon@...gle.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com, 
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com, 
	mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com, 
	linux-kernel@...r.kernel.org, zhangqiao22@...wei.com
Subject: Re: [PATCH 1/4] sched/fair: make sure to try to detach at least one
 movable task

Hi Vincent,

On Thu, Aug 25, 2022 at 5:27 AM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> During load balance, we try at most env->loop_max times to move a task.
> But it can happen that the loop_max LRU tasks (i.e. the tail of
> the cfs_tasks list) can't be moved to dst_cpu because of affinity.
> In this case, loop through the list until we find at least one.

We recently had a user trigger a hard lockup which we believe is due
to this patch. The user in question had O(10k) threads affinitized to
a cpu; it seems the process had an out-of-control thread-spawning
issue and was in the middle of being killed. However, the kill was
being slowed down because load balance was iterating over all of
these threads and bouncing the rq lock (while making no progress due
to ALL_PINNED). Before this patch, load balance would quit after
hitting loop_max.
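
To illustrate, here's a rough sketch of the detach_tasks() flow we
think we're hitting, paraphrased from fair.c with this patch applied
(not the exact upstream code; detach_tasks_sketch() is just a name
for the sketch, and the breather constant has been renamed across
versions):

static int detach_tasks_sketch(struct lb_env *env)
{
	struct list_head *tasks = &env->src_rq->cfs_tasks;
	struct task_struct *p;
	int detached = 0;

	lockdep_assert_rq_held(env->src_rq);

	while (!list_empty(tasks)) {
		p = list_last_entry(tasks, struct task_struct, se.group_node);

		env->loop++;
		/*
		 * With every task pinned away from dst_cpu, LBF_ALL_PINNED
		 * is never cleared, so this loop_max cutoff never fires.
		 */
		if (env->loop > env->loop_max &&
		    !(env->flags & LBF_ALL_PINNED))
			break;

		/*
		 * The breather still fires every ~32 tasks and drops the rq
		 * lock, but load_balance() jumps straight back via
		 * more_balance until env->loop reaches nr_running. With
		 * O(10k) pinned threads that is hundreds of lock/unlock
		 * cycles per load_balance() call, with nothing detached.
		 */
		if (env->loop > env->loop_break) {
			env->loop_break += sched_nr_migrate_break;
			env->flags |= LBF_NEED_BREAK;
			break;
		}

		if (!can_migrate_task(p, env)) {
			/* Pinned: rotate to the head of the list and move on. */
			list_move(&p->se.group_node, tasks);
			continue;
		}

		detach_task(p, env);
		list_add(&p->se.group_node, &env->tasks);
		detached++;

		/* ... imbalance accounting and early-exit checks elided ... */
	}

	return detached;
}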

Even ignoring that specific instance, it seems pretty easy for this
patch to cause a softlockup due to a buggy or malicious process.

For the tradeoff you were trying to make in this patch (spend more
time searching in the hopes that there's something migratable further
in the list), perhaps it would be better to adjust
sysctl.sched_nr_migrate instead of baking this into the kernel?
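
FWIW, loop_max is already derived from that knob in load_balance(),
roughly as below (paraphrased; the knob is exposed as
kernel.sched_nr_migrate or via sched debugfs depending on kernel
version, default 32 on most configs):

	if (busiest->nr_running > 1) {
		/*
		 * The per-invocation search budget already comes from the
		 * nr_migrate knob, so raising it on affected systems covers
		 * the "search deeper" case without an unconditional
		 * in-kernel override.
		 */
		env.flags |= LBF_ALL_PINNED;
		env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);

		/* ... rq lock, detach_tasks()/attach_tasks() loop ... */
	}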

Best,
Josh

>
> The maximum number of detached tasks remains the same as before.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@...aro.org>
> ---
>  kernel/sched/fair.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index da388657d5ac..02b7b808e186 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8052,8 +8052,12 @@ static int detach_tasks(struct lb_env *env)
>                 p = list_last_entry(tasks, struct task_struct, se.group_node);
>
>                 env->loop++;
> -               /* We've more or less seen every task there is, call it quits */
> -               if (env->loop > env->loop_max)
> +               /*
> +                * We've more or less seen every task there is, call it quits
> +                * unless we haven't found any movable task yet.
> +                */
> +               if (env->loop > env->loop_max &&
> +                   !(env->flags & LBF_ALL_PINNED))
>                         break;
>
>                 /* take a breather every nr_migrate tasks */
> @@ -10182,7 +10186,9 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>
>                 if (env.flags & LBF_NEED_BREAK) {
>                         env.flags &= ~LBF_NEED_BREAK;
> -                       goto more_balance;
> +                       /* Stop if we tried all running tasks */
> +                       if (env.loop < busiest->nr_running)
> +                               goto more_balance;
>                 }
>
>                 /*
> --
> 2.17.1
>
