Message-ID: <20160211134018.6b15fd68@utopia>
Date: Thu, 11 Feb 2016 13:40:18 +0100
From: luca abeni <luca.abeni@...tn.it>
To: Juri Lelli <juri.lelli@....com>
Cc: Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
peterz@...radead.org, mingo@...hat.com, vincent.guittot@...aro.org,
wanpeng.li@...mail.com
Subject: Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted
bandwidth
On Thu, 11 Feb 2016 12:27:54 +0000
Juri Lelli <juri.lelli@....com> wrote:
> On 11/02/16 13:22, Luca Abeni wrote:
> > Hi Juri,
> >
> > On Thu, 11 Feb 2016 12:12:57 +0000
> > Juri Lelli <juri.lelli@....com> wrote:
> > [...]
> > > I think we still have (at least) two problems:
> > >
> > > - select_task_rq_dl, if we select a different target
> > > - select_task_rq might make use of select_fallback_rq, if
> > > cpus_allowed changed after the task went to sleep
> > >
> > > Second case is what creates the problem here, as we don't update
> > > task_rq(p) and fallback_cpu ac_bw. I was thinking we might do so,
> > > maybe adding fallback_cpu in task_struct, from
> > > migrate_task_rq_dl() (it has to be added yes), but I fear that we
> > > should hold both rq locks :/.
> > >
> > > Luca, did you already face this problem (if I got it right) and
> > > thought of a way to fix it? I'll go back and stare a bit more at
> > > those paths.
> > In my patch I took care of the first case (modifying
> > select_task_rq_dl() to move the utilization from the "old rq" to the
> > "new rq"), but I never managed to trigger select_fallback_rq() in my
> > tests, so I overlooked that case.
> >
>
> Right, I was thinking of doing the same. And you did that after grabbing
> both locks, right?
Not sure if I did everything correctly, but my code in
select_task_rq_dl() currently looks like this (you can obviously
ignore the "migrate_active" and "*_running_bw()" parts, and focus on
the "*_rq_bw()" stuff):
[...]
	if (rq != cpu_rq(cpu)) {
		int migrate_active;

		raw_spin_lock(&rq->lock);
		migrate_active = hrtimer_active(&p->dl.inactive_timer);
		if (migrate_active) {
			hrtimer_try_to_cancel(&p->dl.inactive_timer);
			sub_running_bw(&p->dl, &rq->dl);
		}
		sub_rq_bw(&p->dl, &rq->dl);
		raw_spin_unlock(&rq->lock);
		rq = cpu_rq(cpu);
		raw_spin_lock(&rq->lock);
		add_rq_bw(&p->dl, &rq->dl);
		if (migrate_active)
			add_running_bw(&p->dl, &rq->dl);
		raw_spin_unlock(&rq->lock);
	}
[...]
lockdep is not screaming, and I am not able to trigger any race
condition or strange behaviour (I am currently at more than 24h of
continuous stress-testing, but maybe my testcase is not so good at
finding races here :)
Luca