Open Source and information security mailing list archives
 
Date:   Wed, 3 Nov 2021 11:51:12 +0100
From:   Mathias Krause <minipli@...ecurity.net>
To:     Michal Koutný <mkoutny@...e.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Odin Ugedal <odin@...d.al>
Cc:     Kevin Tanguy <kevin.tanguy@...p.ovh.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: task_group unthrottling and removal race (was Re: [PATCH]
 sched/fair: Use rq->lock when checking cfs_rq list presence)

Heh, sometimes a good night's sleep helps untangle the knot in your head!

On 03.11.21 at 10:51, Mathias Krause wrote:
> [snip]
> 
> We tried the below patch which, unfortunately, doesn't fix the issue. So
> there must be something else. :(
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 978460f891a1..afee07e9faf9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -9506,13 +9506,17 @@ void sched_offline_group(struct task_group *tg)
>  {
>  	unsigned long flags;
> 
> -	/* End participation in shares distribution: */
> -	unregister_fair_sched_group(tg);
> -
> +	/*
> +	 * Unlink first, to prevent walk_tg_tree_from() (via
> +	 * sched_cfs_period_timer()) from finding us.
> +	 */
>  	spin_lock_irqsave(&task_group_lock, flags);
>  	list_del_rcu(&tg->list);
>  	list_del_rcu(&tg->siblings);
>  	spin_unlock_irqrestore(&task_group_lock, flags);
> +
> +	/* End participation in shares distribution: */

Adding synchronize_rcu() here ensures all concurrent RCU "readers" will
have finished what they're doing, so the unregister below can proceed
safely. That was, apparently, the missing piece.

> +	unregister_fair_sched_group(tg);
>  }
> 
>  static void sched_change_group(struct task_struct *tsk, int type)
> 
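Putting the quoted hunk and the extra grace period together, the whole
function would end up looking roughly like this -- a sketch of the idea,
not a tested patch:

    void sched_offline_group(struct task_group *tg)
    {
    	unsigned long flags;

    	/*
    	 * Unlink first, so walk_tg_tree_from() (via
    	 * sched_cfs_period_timer()) can no longer find us.
    	 */
    	spin_lock_irqsave(&task_group_lock, flags);
    	list_del_rcu(&tg->list);
    	list_del_rcu(&tg->siblings);
    	spin_unlock_irqrestore(&task_group_lock, flags);

    	/* Wait for concurrent tree walkers to leave their RCU read sides. */
    	synchronize_rcu();

    	/* End participation in shares distribution: */
    	unregister_fair_sched_group(tg);
    }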

Now, synchronize_rcu() is quite a heavy hammer, so an RCU callback
should be more appropriate. I'll hack up something and post a proper
patch, if you don't beat me to it.
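For reference, one way such a deferred variant could look. The callback
name is made up here, and it assumes task_group has (or gains) an
rcu_head member that is free for reuse at this point -- a real patch
would need to verify that:

    static void sched_unregister_group_rcu(struct rcu_head *rhp)
    {
    	/* Assumes struct task_group has an rcu_head member named 'rcu'. */
    	struct task_group *tg = container_of(rhp, struct task_group, rcu);

    	/* End participation in shares distribution: */
    	unregister_fair_sched_group(tg);
    }

    void sched_offline_group(struct task_group *tg)
    {
    	unsigned long flags;

    	/* Unlink first, so concurrent walk_tg_tree_from() can't find us. */
    	spin_lock_irqsave(&task_group_lock, flags);
    	list_del_rcu(&tg->list);
    	list_del_rcu(&tg->siblings);
    	spin_unlock_irqrestore(&task_group_lock, flags);

    	/* Defer the unregister until after a full grace period. */
    	call_rcu(&tg->rcu, sched_unregister_group_rcu);
    }

This avoids blocking the caller, at the cost of the task_group staying
registered for tree walks being torn down asynchronously.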

Mathias

