linux-kernel - Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig


Date:	Wed, 9 Sep 2015 12:13:10 +0100
From:	Morten Rasmussen <morten.rasmussen@....com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steve Muckle <steve.muckle@...aro.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"daniel.lezcano@...aro.org" <daniel.lezcano@...aro.org>,
	"yuyang.du@...el.com" <yuyang.du@...el.com>,
	"mturquette@...libre.com" <mturquette@...libre.com>,
	"rjw@...ysocki.net" <rjw@...ysocki.net>,
	Juri Lelli <Juri.Lelli@....com>,
	"sgurrappadi@...dia.com" <sgurrappadi@...dia.com>,
	"pang.xunlei@....com.cn" <pang.xunlei@....com.cn>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by
 capacity_orig

On Wed, Sep 09, 2015 at 11:43:05AM +0200, Peter Zijlstra wrote:
> On Tue, Sep 08, 2015 at 05:53:31PM +0100, Morten Rasmussen wrote:
> > On Tue, Sep 08, 2015 at 03:31:58PM +0100, Morten Rasmussen wrote:
> 
> > > On Tue, Sep 08, 2015 at 02:52:05PM +0200, Peter Zijlstra wrote:
> > > But if we apply the scaling to the weight instead of time, we would only
> > > have to apply it once and not three times like it is now? So maybe we
> > > can end up with almost the same number of multiplications.
> > > 
> > > We might be losing bits for low-priority tasks running on cpus at a low
> > > frequency though.
> > 
> > Something like the below. We should be saving one multiplication.
> 
> > @@ -2577,8 +2575,13 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> >  		return 0;
> >  	sa->last_update_time = now;
> >  
> > -	scale_freq = arch_scale_freq_capacity(NULL, cpu);
> > -	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
> > +	if (weight || running)
> > +		scale_freq = arch_scale_freq_capacity(NULL, cpu);
> > +	if (weight)
> > +		scaled_weight = weight * scale_freq >> SCHED_CAPACITY_SHIFT;
> > +	if (running)
> > +		scale_freq_cpu = scale_freq * arch_scale_cpu_capacity(NULL, cpu)
> > +							>> SCHED_CAPACITY_SHIFT;
> >  
> >  	/* delta_w is the amount already accumulated against our next period */
> >  	delta_w = sa->period_contrib;
> > @@ -2594,16 +2597,15 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> >  		 * period and accrue it.
> >  		 */
> >  		delta_w = 1024 - delta_w;
> > -		scaled_delta_w = cap_scale(delta_w, scale_freq);
> >  		if (weight) {
> > -			sa->load_sum += weight * scaled_delta_w;
> > +			sa->load_sum += scaled_weight * delta_w;
> >  			if (cfs_rq) {
> >  				cfs_rq->runnable_load_sum +=
> > -						weight * scaled_delta_w;
> > +						scaled_weight * delta_w;
> >  			}
> >  		}
> >  		if (running)
> > -			sa->util_sum += scaled_delta_w * scale_cpu;
> > +			sa->util_sum += delta_w * scale_freq_cpu;
> >  
> >  		delta -= delta_w;
> >  
> 
> Sadly that makes the code worse; I get 14 mul instructions where
> previously I had 11.
> 
> What happens is that GCC gets confused and cannot constant propagate the
> new variables, so what used to be shifts now end up being actual
> multiplications.
> 
> With this, I get back to 11. Can you see what happens on ARM where you
> have both functions defined to non constants?

We repeated the experiment on arm and arm64, but still with the functions
defined as constants so we can compare with your results. The mul instruction
count seems to be somewhat compiler-version dependent, but it consistently
shows no effect of the patch:

arm	before	after
gcc4.9	12	12
gcc4.8	10	10

arm64	before	after
gcc4.9	11	11

I will get numbers with the arch-functions implemented as well and do
hackbench runs to see what happens in terms of performance.
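
For reference, the "losing bits" concern raised earlier in the thread can be
illustrated with a small user-space sketch. This is not the kernel code:
load_scale_time() and load_scale_weight() are hypothetical helper names, and
scale_freq = 300 is an assumed value picked only for illustration. cap_scale()
mirrors the shape of the helper used in the quoted diff, and 15 and 1024 are
the load weights of nice 19 and nice 0 tasks.

```c
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT 10

/* Same shape as cap_scale() in the quoted diff: v * s / 1024. */
static inline uint64_t cap_scale(uint64_t v, uint64_t s)
{
	return (v * s) >> SCHED_CAPACITY_SHIFT;
}

/*
 * Hypothetical helper: scale the time delta by frequency first, then
 * multiply by the weight (the approach the patch removes).
 */
static inline uint64_t load_scale_time(uint64_t weight, uint64_t delta,
				       uint64_t scale_freq)
{
	return weight * cap_scale(delta, scale_freq);
}

/*
 * Hypothetical helper: scale the weight once, then multiply by the
 * unscaled delta (the approach the patch introduces).
 */
static inline uint64_t load_scale_weight(uint64_t weight, uint64_t delta,
					 uint64_t scale_freq)
{
	return cap_scale(weight, scale_freq) * delta;
}
```

With weight = 15, delta = 1024 and scale_freq = 300, load_scale_time() gives
15 * 300 = 4500, while load_scale_weight() gives 4 * 1024 = 4096, because
15 * 300 >> 10 truncates to 4. For a nice 0 task (weight = 1024) both variants
give 307200, so the truncation only bites for small weights at low frequency.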