lists.openwall.net
Open Source and information security mailing list archives
 
Date:   Wed, 13 Nov 2019 10:22:41 +0100
From:   Juri Lelli <juri.lelli@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     mingo@...hat.com, glenn@...ora.tech, linux-kernel@...r.kernel.org,
        rostedt@...dmis.org, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, tglx@...utronix.de,
        luca.abeni@...tannapisa.it, c.scordino@...dence.eu.com,
        tommaso.cucinotta@...tannapisa.it, bristot@...hat.com
Subject: Re: [PATCH 2/2] sched/deadline: Temporary copy static parameters to
 boosted non-DEADLINE entities

Hi,

On 12/11/19 11:51, Peter Zijlstra wrote:
> On Tue, Nov 12, 2019 at 08:50:56AM +0100, Juri Lelli wrote:
> > Boosted entities (Priority Inheritance) use static DEADLINE parameters
> > of the top priority waiter. However, there might be cases where top
> > waiter could be a non-DEADLINE entity that is currently boosted by a
> > DEADLINE entity from a different lock chain (i.e., nested priority
> > chains involving entities of non-DEADLINE classes). In this case, top
> > waiter static DEADLINE parameters could be null (initialized to 0 at
> > fork()) and replenish_dl_entity() would hit a BUG().
> 
> Argh!
> 
> > Fix this by temporarily copying static DEADLINE parameters of top
> > DEADLINE waiter (there must be at least one in the chain(s) for the
> > problem above to happen) into boosted entities. Parameters are reset
> > during deboost.
> 
> Also, yuck!

Indeed. :-(

> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4441,19 +4441,21 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
> >  		if (!dl_prio(p->normal_prio) ||
> >  		    (pi_task && dl_entity_preempt(&pi_task->dl, &p->dl))) {
> >  			p->dl.dl_boosted = 1;
> > +			if (!dl_prio(p->normal_prio))
> > +				__dl_copy_static(p, pi_task);
> >  			queue_flag |= ENQUEUE_REPLENISH;
> >  		} else
> >  			p->dl.dl_boosted = 0;
> >  		p->sched_class = &dl_sched_class;
> 
> So I thought our basic approach was deadline inheritance and screw
> runtime accounting.
> 
> Given that, I don't quite understand the REPLENISH hack there. Should we
> not simply copy dl->deadline around (and restore on unboost)?
> 
> That is, should we not do something 'simple' like this:
> 
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 84b26d38c929..1579c571cb83 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -522,6 +522,7 @@ struct sched_dl_entity {
>  	 */
>  	s64				runtime;	/* Remaining runtime for this instance	*/
>  	u64				deadline;	/* Absolute deadline for this instance	*/
> +	u64				normal_deadline;
>  	unsigned int			flags;		/* Specifying the scheduler behaviour	*/
>  
>  	/*
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 26e4ffa01e7a..16164b0ba80b 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4452,9 +4452,11 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
>  		if (!dl_prio(p->normal_prio) ||
>  		    (pi_task && dl_entity_preempt(&pi_task->dl, &p->dl))) {
>  			p->dl.dl_boosted = 1;
> -			queue_flag |= ENQUEUE_REPLENISH;
> -		} else
> +			p->dl.deadline = pi_task->dl.deadline;
> +		} else {
>  			p->dl.dl_boosted = 0;
> +			p->dl.deadline = p->dl.normal_deadline;
> +		}
>  		p->sched_class = &dl_sched_class;
>  	} else if (rt_prio(prio)) {
>  		if (dl_prio(oldprio))
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 43323f875cb9..0ad7c2797f11 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -674,6 +674,7 @@ static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
>  	 * spent on hardirq context, etc.).
>  	 */
>  	dl_se->deadline = rq_clock(rq) + dl_se->dl_deadline;
> +	dl_se->normal_deadline = dl_se->deadline;
>  	dl_se->runtime = dl_se->dl_runtime;
>  }
>  
> @@ -709,6 +710,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
>  	 */
>  	if (dl_se->dl_deadline == 0) {
>  		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> +		dl_se->normal_deadline = dl_se->deadline;
>  		dl_se->runtime = pi_se->dl_runtime;
>  	}
>  
> @@ -723,6 +725,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
>  	 */
>  	while (dl_se->runtime <= 0) {
>  		dl_se->deadline += pi_se->dl_period;
> > +		dl_se->normal_deadline = dl_se->deadline;
>  		dl_se->runtime += pi_se->dl_runtime;

So, the problem is more related to pi_se->dl_runtime than to its
deadline. Even if we don't replenish at the instant boosting happens,
the boosted task might still deplete its runtime while boosted, which
would cause update_curr_dl() to eventually call
enqueue_task_dl(..., ENQUEUE_REPLENISH) - we don't perform runtime
enforcement on boosted tasks, but we still do accounting and 'instant'
replenishment with deadline postponement ('soft CBS'). This in turn
hits BUG_ON(pi_se->dl_runtime <= 0), since, in a case like Glenn's, N2
and N1 are non-DEADLINE tasks and N1 would replenish using N2's
(pi_se) dl_runtime, finding it to be 0.

Does that make sense?
