linux-kernel - Re: change in sched cpu_power causing regressions with SCHED

Message-Id: <1266606948.2814.62.camel@sbs-t61.sc.intel.com>
Date:	Fri, 19 Feb 2010 11:15:48 -0800
From:	Suresh Siddha <suresh.b.siddha@...el.com>
To:	"svaidy@...ux.vnet.ibm.com" <svaidy@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"Ma, Ling" <ling.ma@...el.com>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	"ego@...ibm.com" <ego@...ibm.com>
Subject: Re: change in sched cpu_power causing regressions with SCHED_MC

On Fri, 2010-02-19 at 05:03 -0800, Vaidyanathan Srinivasan wrote:
> > -	/* Don't want to pull so many tasks that a group would go idle */
> > -	max_pull = min(sds->max_load - sds->avg_load,
> > -			sds->max_load - sds->busiest_load_per_task);
> > +	if (!sds->group_imb) {
> > +		/*
> > + 	 	 * Don't want to pull so many tasks that a group would go idle.
> > +	 	 */
> > +		load_above_capacity = (sds->busiest_nr_running - 
> > +						sds->busiest_group_capacity);
> > +
> > +		load_above_capacity *= (SCHED_LOAD_SCALE * SCHED_LOAD_SCALE);
> > +	
> > +		load_above_capacity /= sds->busiest->cpu_power;
> > +	}
> 
> This seems tricky.  max_load - avg_load will be less than
> load_above_capacity most of the time.  How does this expression
> increase the max_pull from previous expression?

I am not trying to increase or decrease max_pull relative to the previous
expression. I am just trying to do the right thing (to ultimately address
smt/mc power-savings), because "max_load - busiest_load_per_task" no
longer represents the load above capacity.
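
As a rough illustration of the two quoted hunks taken together, here is a
minimal standalone sketch of the new max_pull computation. It is not the
kernel code: the struct lb_stats_sketch type, the max_pull_sketch() name,
the SCHED_LOAD_SCALE value of 1024, and the ~0UL starting value for
load_above_capacity are all assumptions made for the example. It also
assumes busiest_nr_running > busiest_group_capacity whenever group_imb is
not set, since otherwise the unsigned subtraction would wrap.

```c
#include <stdbool.h>

/* Assumed value for illustration: the load contributed by one nice-0 task. */
#define SCHED_LOAD_SCALE 1024UL

/* Hypothetical stand-in for the sd_lb_stats fields used by the patch. */
struct lb_stats_sketch {
	unsigned long max_load;               /* load of the busiest group */
	unsigned long avg_load;               /* average load across the domain */
	unsigned long busiest_nr_running;     /* tasks running in the busiest group */
	unsigned long busiest_group_capacity; /* tasks the busiest group can hold */
	unsigned long busiest_cpu_power;      /* cpu_power of the busiest group */
	bool group_imb;                       /* busiest group is internally imbalanced */
};

static unsigned long max_pull_sketch(const struct lb_stats_sketch *s)
{
	/* Assumption: "no limit" unless the capacity-based bound applies. */
	unsigned long load_above_capacity = ~0UL;
	unsigned long above_avg = s->max_load - s->avg_load;

	if (!s->group_imb) {
		/* Number of tasks beyond what the group can hold ... */
		load_above_capacity = s->busiest_nr_running -
				      s->busiest_group_capacity;
		/* ... converted to load and scaled by the group's cpu_power. */
		load_above_capacity *= SCHED_LOAD_SCALE * SCHED_LOAD_SCALE;
		load_above_capacity /= s->busiest_cpu_power;
	}

	/* Pull no more than the smaller of the two bounds. */
	return above_avg < load_above_capacity ? above_avg : load_above_capacity;
}
```

For example, with three tasks on a group of capacity two and cpu_power
2048, the capacity bound works out to 1 * 1024 * 1024 / 2048 = 512, which
is what limits max_pull when max_load - avg_load is larger than that.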

> 
> > +	/*
> > +	 * We're trying to get all the cpus to the average_load, so we don't
> > +	 * want to push ourselves above the average load, nor do we wish to
> > +	 * reduce the max loaded cpu below the average load, as either of these
> > +	 * actions would just result in more rebalancing later, and ping-pong
> > +	 * tasks around. Thus we look for the minimum possible imbalance.
> > +	 * Negative imbalances (*we* are more loaded than anyone else) will
> > +	 * be counted as no imbalance for these purposes -- we can't fix that
> > +	 * by pulling tasks to us. Be careful of negative numbers as they'll
> > +	 * appear as very large values with unsigned longs.
> > +	 */
> > +	max_pull = min(sds->max_load - sds->avg_load, load_above_capacity);
> 
> Does this increase or decrease the value of max_pull from previous
> expression?

Does the above help answer your question, Vaidy?

>  
> >  	/* How much load to actually move to equalise the imbalance */
> >  	*imbalance = min(max_pull * sds->busiest->cpu_power,
> > @@ -4069,19 +4097,6 @@ find_busiest_group(struct sched_domain *sd, int this_cpu,
> >  		sds.busiest_load_per_task =
> >  			min(sds.busiest_load_per_task, sds.avg_load);
> > 
> > -	/*
> > -	 * We're trying to get all the cpus to the average_load, so we don't
> > -	 * want to push ourselves above the average load, nor do we wish to
> > -	 * reduce the max loaded cpu below the average load, as either of these
> > -	 * actions would just result in more rebalancing later, and ping-pong
> > -	 * tasks around. Thus we look for the minimum possible imbalance.
> > -	 * Negative imbalances (*we* are more loaded than anyone else) will
> > -	 * be counted as no imbalance for these purposes -- we can't fix that
> > -	 * by pulling tasks to us. Be careful of negative numbers as they'll
> > -	 * appear as very large values with unsigned longs.
> > -	 */
> > -	if (sds.max_load <= sds.busiest_load_per_task)
> > -		goto out_balanced;
> 
> This is right.  This condition was treating most cases as balanced and
> exiting right here. However, if this check is removed, we will have to
> execute more code to detect/ascertain the balanced case.

To add, in update_sd_lb_stats() we are already doing this:

               } else if (sgs.avg_load > sds->max_load &&
                           (sgs.sum_nr_running > sgs.group_capacity ||
                                sgs.group_imb)) {

So we already check sum_nr_running > group_capacity when selecting the
busiest group, which means the equivalent of this balanced check is done
much earlier.
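
To make that concrete, the selection condition can be pulled out into a
small standalone predicate. This is only a sketch: struct sg_stats_sketch
and is_busiest_candidate() are hypothetical names for the example, and
only the condition itself is taken from update_sd_lb_stats() as quoted
above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-group statistics (sgs) fields. */
struct sg_stats_sketch {
	unsigned long avg_load;       /* average load of this group */
	unsigned long sum_nr_running; /* tasks running in this group */
	unsigned long group_capacity; /* tasks this group can hold */
	bool group_imb;               /* group is internally imbalanced */
};

/*
 * A group can replace the current busiest group only if it is more loaded
 * than the current maximum AND it is either over capacity or imbalanced.
 */
static bool is_busiest_candidate(const struct sg_stats_sketch *sgs,
				 unsigned long cur_max_load)
{
	return sgs->avg_load > cur_max_load &&
	       (sgs->sum_nr_running > sgs->group_capacity ||
		sgs->group_imb);
}
```

A group running no more tasks than its capacity, and not flagged as
imbalanced, is never picked as busiest under this condition, however
high its avg_load is.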

thanks,
suresh
