Message-ID: <20160210134240.GB11415@e106622-lin>
Date: Wed, 10 Feb 2016 13:42:40 +0000
From: Juri Lelli <juri.lelli@....com>
To: luca abeni <luca.abeni@...tn.it>
Cc: rostedt@...dmis.org, linux-kernel@...r.kernel.org,
peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
wanpeng.li@...mail.com
Subject: Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted
bandwidth
On 10/02/16 13:48, Luca Abeni wrote:
> Hi,
>
> On Wed, 10 Feb 2016 11:32:58 +0000
> Juri Lelli <juri.lelli@....com> wrote:
> [...]
> > From 62f70ca3051672dce209e8355cf5eddc9d825c2a Mon Sep 17 00:00:00 2001
> > From: Juri Lelli <juri.lelli@....com>
> > Date: Sat, 6 Feb 2016 12:41:09 +0000
> > Subject: [PATCH 1/2] sched/deadline: add per rq tracking of admitted
> > bandwidth
> >
> > Currently the SCHED_DEADLINE scheduling policy tracks the bandwidth of
> > admitted tasks at the root_domain level only. This creates problems
> > when such data structures are destroyed, for example when scheduling
> > domains are reconfigured.
> >
> > This is part one of two changes required to fix the problem. In this
> > patch we add per-rq tracking of admitted bandwidth. Tasks bring with
> > them their bandwidth contribution when they enter the system and are
> > enqueued for the first time. Contributions are then moved around when
> > migrations happen and removed when tasks die.
>
> I think this patch actually does two different things (addressing two
> separate problems):
> 1) it introduces the tracking of per-rq utilization (used in your next
> patch to address the root domain issues)
> 2) it fixes a bug in the current utilization tracking mechanism.
> Currently, a task doing
> 	while (1) {
> 		switch to SCHED_DEADLINE
> 		switch to SCHED_OTHER
> 	}
> brings dl_b->total_bw below 0. Thanks to Juri for showing me this
> problem (and how to reproduce it) in a private email.
> This happens because when the task switches back from SCHED_DEADLINE
> to SCHED_OTHER, switched_from_dl() does not clear its deadline
> parameters (they will be cleared by the deadline timer when it
> fires). But dl_overflow() removes its utilization from
> dl_b->total_bw. When the task switches back to SCHED_DEADLINE, the
> if (new_bw == p->dl.dl_bw) check in dl_overflow() prevents
> __dl_add() from being called, and so when the task switches back to
> SCHED_OTHER dl_b->total_bw becomes negative.
>
Yep. That is also the reason why I think, whatever refactoring we are
going to do, we should keep per-rq and root_domain accounting done
together.
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 9503d59..0ee0ec2 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2432,7 +2432,7 @@ static int dl_overflow(struct task_struct *p, int policy,
> >  	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
> >  	int cpus, err = -1;
> >
> > - if (new_bw == p->dl.dl_bw)
> > + if (task_has_dl_policy(p) && new_bw == p->dl.dl_bw)
> > return 0;
>
> This hunk actually fixes issue 2) mentioned above, so I think it should
> be committed in a short time (independently from the rest of the
> patch). And maybe it is a good candidate for backporting to stable
> kernels?
>
Yes, this is a sensible fix per se. I can split it and send it
separately.
Best,
- Juri