Date:	Tue, 8 Jul 2014 14:50:50 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	bsegall@...gle.com
Cc:	Yuyang Du <yuyang.du@...el.com>, mingo@...hat.com,
	linux-kernel@...r.kernel.org, rafael.j.wysocki@...el.com,
	arjan.van.de.ven@...el.com, len.brown@...el.com,
	alan.cox@...el.com, mark.gross@...el.com, pjt@...gle.com,
	fengguang.wu@...el.com
Subject: Re: [PATCH 2/2] sched: Rewrite per entity runnable load average
 tracking

On Mon, Jul 07, 2014 at 03:25:07PM -0700, bsegall@...gle.com wrote:
> >> +static inline void enqueue_entity_load_avg(struct sched_entity *se)
> >>  {
> >> +	struct sched_avg *sa = &se->avg;
> >> +	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >> +	u64 now = cfs_rq_clock_task(cfs_rq);
> >> +	u32 old_load_avg = cfs_rq->avg.load_avg;
> >> +	int migrated = 0;
> >>  
> >> +	if (entity_is_task(se)) {
> >> +		if (sa->last_update_time == 0) {
> >> +			sa->last_update_time = now;
> >> +			migrated = 1;
> >>  		}
> >> +		else
> >> +			__update_load_avg(now, sa, se->on_rq * se->load.weight);
> >>  	}
> >>  
> >> +	__update_load_avg(now, &cfs_rq->avg, cfs_rq->load.weight);
> >>  
> >> +	if (migrated)
> >> +		cfs_rq->avg.load_avg += sa->load_avg;
> >>  
> >> +	synchronize_tg_load_avg(cfs_rq, old_load_avg);
> >>  }
> >
> > So here you add the task to the cfs_rq avg when it gets migrated in,
> > however:
> >
> >> @@ -4552,17 +4326,9 @@ migrate_task_rq_fair(struct task_struct *p, int next_cpu)
> >>  	struct sched_entity *se = &p->se;
> >>  	struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >>  
> >> +	/* Update task on old CPU, then ready to go (entity must be off the queue) */
> >> +	__update_load_avg(cfs_rq_clock_task(cfs_rq), &se->avg, 0);
> >> +	se->avg.last_update_time = 0;
> >>  
> >>  	/* We have migrated, no longer consider this task hot */
> >>  	se->exec_start = 0;
> >
> > there you don't remove it first..
> 
> Yeah, the issue is that you can't remove it, because you don't hold the
> lock. Thus the whole runnable/blocked split iirc. Also the
> cfs_rq_clock_task read is incorrect for the same reason (and while
> rq_clock_task could certainly be fixed min_vruntime-style,
> cfs_rq_clock_task would be harder).
> 
> The problem with just working around the clock issue somehow and then using an
> atomic to do this subtraction is that you have no idea when the /cfs_rq/
> last updated - there's no guarantee it is up to date, and if it's not
> then the subtraction is wrong. You can't update it to make it up to date
> like the se->avg, because you don't hold any locks. You would need
> decay_counter stuff like the current code, and I'm not certain how well
> that would work out without the runnable/blocked split.

Right; so the current code jumps through a few nasty hoops because of
this. But I think the proposed code got this wrong (understandably).

But yes, we spend a lot of time and effort to remove the rq->lock from
the remote wakeup path, which makes all this very tedious indeed.

Like you said, we can indeed make the time thing work, but the remote
subtraction is going to be messy. Can't seem to come up with anything
sane there either.

