Date:   Wed, 15 Mar 2023 15:34:19 +0000
From:   Valentin Schneider <vschneid@...hat.com>
To:     Yicong Yang <yangyicong@...wei.com>, mingo@...hat.com,
        peterz@...radead.org, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, linux-kernel@...r.kernel.org
Cc:     dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        mgorman@...e.de, bristot@...hat.com, linuxarm@...wei.com,
        prime.zeng@...wei.com, wangjie125@...wei.com,
        yangyicong@...ilicon.com
Subject: Re: [PATCH] sched/fair: Don't balance migration disabled tasks

On 13/03/23 14:57, Yicong Yang wrote:
>  kernel/sched/fair.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7a1b1f855b96..8fe767362d22 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8433,6 +8433,10 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>       if (kthread_is_per_cpu(p))
>               return 0;
>
> +	/* Migration disabled tasks need to be kept on their running CPU. */
> +	if (is_migration_disabled(p))
> +		return 0;
> +
>       if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
>               int cpu;

That cpumask check should cover migration_disabled tasks, unless they
haven't gone through migrate_disable_switch() yet
(p->migration_disabled == 1, but the cpus_ptr hasn't been touched yet).

Now, if that's the case, the task has to be src_rq's current (since it
hasn't switched out), which means can_migrate_task() should exit via:

        if (task_on_cpu(env->src_rq, p)) {
                schedstat_inc(p->stats.nr_failed_migrations_running);
                return 0;
        }

and thus not try to detach_task(). With that in mind, I don't get how your
splat can happen, nor how the change can help (a remote task p could
execute migrate_disable() concurrently with can_migrate_task(p)).
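
To make the two cases above concrete, here is a rough userspace toy model.
It is not kernel code: every toy_* name is made up for illustration, and a
plain 64-bit mask stands in for p->cpus_ptr. It only assumes what is
described above, i.e. that the affinity check comes before the "is it
running" check, and that switching out a migration-disabled task narrows
its mask to its current CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few task fields the argument relies on. */
struct toy_task {
    int      migration_disabled; /* nesting count, as bumped by migrate_disable() */
    bool     on_cpu;             /* true while the task is src_rq's current */
    uint64_t cpus_allowed;       /* bitmask playing the role of *p->cpus_ptr */
};

enum toy_reject { TOY_OK = 0, TOY_REJECT_AFFINITY, TOY_REJECT_RUNNING };

/*
 * Models the effect attributed to migrate_disable_switch(): when a
 * migration-disabled task is switched out, its mask shrinks to this CPU.
 */
static void toy_switch_out(struct toy_task *p, int this_cpu)
{
    assert(this_cpu >= 0 && this_cpu < 64);
    p->on_cpu = false;
    if (p->migration_disabled)
        p->cpus_allowed = UINT64_C(1) << this_cpu;
}

/* Affinity check first, then the running check, as discussed above. */
static enum toy_reject toy_can_migrate(const struct toy_task *p, int dst_cpu)
{
    assert(dst_cpu >= 0 && dst_cpu < 64);
    if (!(p->cpus_allowed & (UINT64_C(1) << dst_cpu)))
        return TOY_REJECT_AFFINITY;
    if (p->on_cpu)
        return TOY_REJECT_RUNNING;
    return TOY_OK;
}
```

In this model a task with migration_disabled == 1 is rejected either way:
while it is still current, it is rejected as running; once toy_switch_out()
has run, it is rejected by the affinity mask. The model does not cover
locking at all, so it says nothing about the concurrency question.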

I'm a bit confused here: detach_tasks() runs entirely with src_rq's
rq_lock held, so there shouldn't be any surprises.

Can you share any extra context? E.g. exact HEAD of your tree, maybe the
migrate_disable task in question if you have that info.

>
> --
> 2.24.0
