linux-kernel - Re: [RFC v2 3/7] Improve the tracking of active utilisation

Message-ID: <20160405195657.586e8c97@utopia>
Date:	Tue, 5 Apr 2016 19:56:57 +0200
From:	luca abeni <luca.abeni@...tn.it>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@....com>
Subject: Re: [RFC v2 3/7] Improve the tracking of active utilisation

On Tue, 5 Apr 2016 17:00:36 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Fri, Apr 01, 2016 at 05:12:29PM +0200, Luca Abeni wrote:
> > +static void task_go_inactive(struct task_struct *p)
> > +{
> > +	struct sched_dl_entity *dl_se = &p->dl;
> > +	struct hrtimer *timer = &dl_se->inactive_timer;
> > +	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> > +	struct rq *rq = rq_of_dl_rq(dl_rq);
> > +	ktime_t now, act;
> > +	s64 delta;
> > +	u64 zerolag_time;
> > +
> > +	WARN_ON(dl_se->dl_runtime == 0);
> > +
> > +	/* If the inactive timer is already armed, return immediately */
> > +	if (hrtimer_active(&dl_se->inactive_timer))
> > +		return;
> 
> So while we start the timer on the local cpu, we don't migrate the timer
> when we migrate the task, so the callback can happen on a remote cpu,
> right?
> 
> Therefore, the timer function might still be running, but just have done
> task_rq_unlock(), which would have allowed our cpu to acquire the
> rq->lock and get here.
> 
> Then the above check is true, we'll quit, but effectively the inactive
> timer will not run 'again'.
Uhm... So the problem is:
- Task T wakes up, but cannot cancel its inactive timer, because the timer is running
	+ This should not be a problem: inactive_task_timer() will return without
	  doing anything
- Before inactive_task_timer() can actually run, task T migrates to a different CPU
- Before the timer finishes running, the task blocks again... So, task_go_inactive()
  sees the timer as active and returns immediately. But the timer has already
  executed (without doing anything). So no one decreases the rq utilisation.

I did not think about this issue, and I never managed to trigger it in my
tests... I'll try to see how it can be addressed. Do you have any suggestions?
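
To make the sequence above concrete, here is a minimal single-threaded sketch
that replays the interleaving step by step. It is only a toy model under stated
assumptions: the toy_* names and fields are hypothetical and are not kernel
API. The timer_active flag stands in for what hrtimer_active() reports, and the
runnable flag stands in for the task state check in inactive_task_timer().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every toy_* name is hypothetical, not kernel API. */
struct toy_rq {
	long running_bw;	/* active utilisation tracked on this rq */
};

struct toy_task {
	long bw;		/* bandwidth of the task */
	bool timer_active;	/* stands in for hrtimer_active() */
	bool runnable;		/* stands in for the task state check */
};

/* Body of the modelled timer callback: a no-op if the task already woke up. */
static void toy_timer_body(struct toy_task *t, struct toy_rq *rq)
{
	if (t->runnable)
		return;
	rq->running_bw -= t->bw;
}

/* The modelled callback returns: only now does the timer stop being active. */
static void toy_timer_finish(struct toy_task *t)
{
	t->timer_active = false;
}

/* Modelled task_go_inactive(), with the early return from the quoted patch. */
static void toy_go_inactive(struct toy_task *t)
{
	t->runnable = false;
	if (t->timer_active)
		return;		/* timer looks armed: rely on it */
	t->timer_active = true;	/* arm the inactive timer */
}

/* Replays the interleaving; returns the bandwidth left on the rq. */
static long toy_race_scenario(void)
{
	struct toy_rq rq = { .running_bw = 0 };
	struct toy_task t = { .bw = 10, .timer_active = false, .runnable = true };

	rq.running_bw += t.bw;		/* task is active on this rq */
	toy_go_inactive(&t);		/* task blocks, timer is armed */
	t.runnable = true;		/* task wakes up, cannot cancel the timer */
	toy_timer_body(&t, &rq);	/* callback sees a runnable task: no-op */
	toy_go_inactive(&t);		/* task blocks again: early return */
	toy_timer_finish(&t);		/* callback returns, nothing is re-armed */

	return rq.running_bw;		/* bandwidth is still accounted */
}
```

In this model toy_race_scenario() leaves the full bandwidth of the blocked
task on the rq with no timer pending, which is the leak described above. A
real reproduction would need two CPUs and the actual hrtimer code.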

[...]
> > @@ -1071,6 +1164,23 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  	}
> >  	rcu_read_unlock();
> >  
> > +	if (rq != cpu_rq(cpu)) {
> 
> I don't think this is right, you want:
> 
> 	if (task_cpu(p) != cpu) {
> 
> because @cpu does not need to be task_cpu().
Uhm... I must have misunderstood something in the code, then :(
What I want to do here is to check if select_task_rq_dl() selected
a new CPU for this task... Since at the beginning of the function
rq is set as
	rq = cpu_rq(cpu);
I was thinking about checking if this is still true (if not, it
means that the value of "cpu" changed).

I'll look at it again.
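
As a note on the difference between the two checks, the following toy sketch
compares them. It assumes, as stated in the review comment, that the cpu
argument does not have to be the CPU the task is currently on. All toy_*
names are hypothetical stand-ins and not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every toy_* name is hypothetical, not kernel API. */
struct toy_rq {
	int unused;
};

static struct toy_rq toy_rqs[4];

struct toy_task {
	int cpu;	/* stands in for task_cpu(p) */
};

static struct toy_rq *toy_cpu_rq(int cpu)
{
	return &toy_rqs[cpu];
}

/*
 * Check as in the quoted patch: rq is taken from the cpu argument at
 * entry, then compared with the rq of the finally selected cpu.
 */
static bool toy_moved_by_rq_check(int cpu_arg, int cpu_selected)
{
	struct toy_rq *rq = toy_cpu_rq(cpu_arg);

	return rq != toy_cpu_rq(cpu_selected);
}

/* Check suggested in the review: compare the task's own cpu instead. */
static bool toy_moved_by_task_cpu_check(const struct toy_task *p,
					int cpu_selected)
{
	return p->cpu != cpu_selected;
}
```

With a task on CPU 0, a cpu argument of 2 and a final selection of 2, the
first check reports no change while the second one reports a migration. When
the cpu argument equals the task's CPU, the two checks agree.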


> 
> > +		int migrate_active;
> > +
> > +		raw_spin_lock(&rq->lock);
> 
> Which then also means @rq is 'wrong', so you'll have to add:
> 
> 		rq = task_rq(p);
Ok; I completely misunderstood the current code, then... :(


> 
> before this.
> 
> > +		migrate_active = hrtimer_active(&p->dl.inactive_timer);
> > +		if (migrate_active)
> > +			sub_running_bw(&p->dl, &rq->dl);
> > +		raw_spin_unlock(&rq->lock);
> 
> At this point task_rq() is still the above rq, so if the inactive timer
> hits here it will lock this rq and subtract the running bw here _again_,
> right?
I think it will see the task state as TASK_RUNNING, so it will do nothing.
Or it will be cancelled later when the task is enqueued... I'll double check this.



			Thanks,
				Luca

> 
> > +		if (migrate_active) {
> > +			rq = cpu_rq(cpu);
> > +			raw_spin_lock(&rq->lock);
> > +			add_running_bw(&p->dl, &rq->dl);
> > +			raw_spin_unlock(&rq->lock);
> > +		}
> > +	}