linux-kernel - Re: [PATCH v7 11/14] sched/fair: Introduce an energy estimation helper function

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20181004090211.ponos72a26ybemjh@queper01-lin>
Date:   Thu, 4 Oct 2018 10:02:13 +0100
From:   Quentin Perret <quentin.perret@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     rjw@...ysocki.net, linux-kernel@...r.kernel.org,
        linux-pm@...r.kernel.org, gregkh@...uxfoundation.org,
        mingo@...hat.com, dietmar.eggemann@....com,
        morten.rasmussen@....com, chris.redpath@....com,
        patrick.bellasi@....com, valentin.schneider@....com,
        vincent.guittot@...aro.org, thara.gopinath@...aro.org,
        viresh.kumar@...aro.org, tkjos@...gle.com, joel@...lfernandes.org,
        smuckle@...gle.com, adharmap@...eaurora.org,
        skannan@...eaurora.org, pkondeti@...eaurora.org,
        juri.lelli@...hat.com, edubezval@...il.com,
        srinivas.pandruvada@...ux.intel.com, currojerez@...eup.net,
        javi.merino@...nel.org
Subject: Re: [PATCH v7 11/14] sched/fair: Introduce an energy estimation
 helper function

Hi Peter,

On Thursday 04 Oct 2018 at 10:34:57 (+0200), Peter Zijlstra wrote:
> On Wed, Sep 12, 2018 at 10:13:06AM +0100, Quentin Perret wrote:
> > +static unsigned long cpu_util_next(int cpu, struct task_struct *p, int dst_cpu)
> > +{
> > +	struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
> > +	unsigned long util_est, util = READ_ONCE(cfs_rq->avg.util_avg);
> > +
> > +	/*
> > +	 * If @p migrates from @cpu to another, remove its contribution. Or,
> > +	 * if @p migrates from another CPU to @cpu, add its contribution. In
> > +	 * the other cases, @cpu is not impacted by the migration, so the
> > +	 * util_avg should already be correct.
> > +	 */
> > +	if (task_cpu(p) == cpu && dst_cpu != cpu)
> > +		util = max_t(long, util - task_util(p), 0);
> 
> That's not quite right; what you want to check for is underflow, but the
> above also results in 0 if util - task_util() > LONG_MAX without an
> underflow.

Hmm, I basically took that from capacity_spare_wake(). So I guess that
function wants fixing too ... :/

Now, is it actually possible to hit that case (util - task_util() > LONG_MAX)
given that util numbers are in the 1024 scale ? I guess that _could_
become a problem if we decided to increase the resolution one day.

> You could write: sub_positive(&util, task_util(p));

Ah right, I forgot about that macro. It seems to have a slightly
stronger semantics than what I need here though. I don't really care if
the intermediate value hits the memory and task_util() already has a
READ_ONCE. Not sure if that's a big deal.

> 
> > +	else if (task_cpu(p) != cpu && dst_cpu == cpu)
> > +		util += task_util(p);
> > +
> > +	if (sched_feat(UTIL_EST)) {
> > +		util_est = READ_ONCE(cfs_rq->avg.util_est.enqueued);
> > +
> > +		/*
> > +		 * During wake-up, the task isn't enqueued yet and doesn't
> > +		 * appear in the cfs_rq->avg.util_est.enqueued of any rq,
> > +		 * so just add it (if needed) to "simulate" what will be
> > +		 * cpu_util() after the task has been enqueued.
> > +		 */
> > +		if (dst_cpu == cpu)
> > +			util_est += _task_util_est(p);
> > +
> > +		util = max(util, util_est);
> > +	}
> > +
> > +	return min_t(unsigned long, util, capacity_orig_of(cpu));
> 
> AFAICT both @util and capacity_orig_of() have 'unsigned long' as type.

Indeed, no need for min_t.

> > +}
> > +
> > +/*
> > + * compute_energy(): Estimates the energy that would be consumed if @p was
> > + * migrated to @dst_cpu. compute_energy() predicts what will be the utilization
> > + * landscape of the * CPUs after the task migration, and uses the Energy Model
> > + * to compute what would be the energy if we decided to actually migrate that
> > + * task.
> > + */
> > +static long compute_energy(struct task_struct *p, int dst_cpu,
> > +							struct perf_domain *pd)
> 
> static long
> compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)

OK.

> > +{
> > +	long util, max_util, sum_util, energy = 0;
> > +	int cpu;
> > +
> > +	while (pd) {
> > +		max_util = sum_util = 0;
> > +		/*
> > +		 * The capacity state of CPUs of the current rd can be driven by
> > +		 * CPUs of another rd if they belong to the same performance
> > +		 * domain. So, account for the utilization of these CPUs too
> > +		 * by masking pd with cpu_online_mask instead of the rd span.
> > +		 *
> > +		 * If an entire performance domain is outside of the current rd,
> > +		 * it will not appear in its pd list and will not be accounted
> > +		 * by compute_energy().
> > +		 */
> > +		for_each_cpu_and(cpu, perf_domain_span(pd), cpu_online_mask) {
> > +			util = cpu_util_next(cpu, p, dst_cpu);
> > +			util = schedutil_freq_util(cpu, util, ENERGY_UTIL);
> > +			max_util = max(util, max_util);
> > +			sum_util += util;
> > +		}
> > +
> > +		energy += em_pd_energy(pd->obj, max_util, sum_util);
> > +		pd = pd->next;
> > +	}
> 
> No real strong preference, but you could write that like:
> 
> 	for (; pd; pd = pd->next) {
> 	}

And OK for that one too. That saves one line. The next patch has the
same pattern, I'll change it too.

Thanks,
Quentin