Message-Id: <1266606948.2814.62.camel@sbs-t61.sc.intel.com>
Date: Fri, 19 Feb 2010 11:15:48 -0800
From: Suresh Siddha <suresh.b.siddha@...el.com>
To: "svaidy@...ux.vnet.ibm.com" <svaidy@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
"Ma, Ling" <ling.ma@...el.com>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
"ego@...ibm.com" <ego@...ibm.com>
Subject: Re: change in sched cpu_power causing regressions with SCHED_MC
On Fri, 2010-02-19 at 05:03 -0800, Vaidyanathan Srinivasan wrote:
> > - /* Don't want to pull so many tasks that a group would go idle */
> > - max_pull = min(sds->max_load - sds->avg_load,
> > - sds->max_load - sds->busiest_load_per_task);
> > + if (!sds->group_imb) {
> > + /*
> > + * Don't want to pull so many tasks that a group would go idle.
> > + */
> > + load_above_capacity = (sds->busiest_nr_running -
> > + sds->busiest_group_capacity);
> > +
> > + load_above_capacity *= (SCHED_LOAD_SCALE * SCHED_LOAD_SCALE);
> > +
> > + load_above_capacity /= sds->busiest->cpu_power;
> > + }
>
> This seems tricky. max_load - avg_load will be less than
> load_above_capacity most of the time. How does this expression
> increase the max_pull from previous expression?
I am not trying to increase or decrease max_pull relative to the previous
expression. I am just trying to do the right thing (ultimately, to address
smt/mc power-savings), because "max_load - busiest_load_per_task" no
longer represents the load above capacity.
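To make the arithmetic concrete, here is a rough standalone userspace sketch of the new calculation, not the kernel code itself. The struct and helper names are hypothetical, SCHED_LOAD_SCALE is assumed to be 1024, and the "no limit" starting value of ~0UL for the group_imb case is an assumption, since the quoted hunk does not show the initialisation.

```c
#include <assert.h>

/* Assumed value; the kernel defines this as a power of two. */
#define SCHED_LOAD_SCALE 1024UL

/* Hypothetical container for the fields the quoted hunk reads. */
struct group_stats {
	unsigned long nr_running;	/* tasks running in the busiest group */
	unsigned long capacity;		/* tasks the group can run at full power */
	unsigned long cpu_power;	/* group cpu_power, scaled by SCHED_LOAD_SCALE */
	unsigned long max_load;		/* load of the busiest group */
	unsigned long avg_load;		/* average load across the domain */
	int group_imb;			/* group is internally imbalanced */
};

unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Load of the tasks in excess of capacity, normalised by cpu_power.
 * Callers are assumed to pass nr_running > capacity unless group_imb is
 * set; otherwise the unsigned subtraction would wrap around.
 */
unsigned long load_above_capacity(const struct group_stats *g)
{
	unsigned long above = ~0UL;	/* assumption: no limit when group_imb */

	assert(g->cpu_power != 0);
	if (!g->group_imb) {
		above = g->nr_running - g->capacity;
		above *= SCHED_LOAD_SCALE * SCHED_LOAD_SCALE;
		above /= g->cpu_power;
	}
	return above;
}

/* Never pull more than the excess over average or the excess over capacity. */
unsigned long max_pull(const struct group_stats *g)
{
	return min_ul(g->max_load - g->avg_load, load_above_capacity(g));
}
```

With 4 tasks running on a group of capacity 2 and cpu_power 2048, the load above capacity works out to 2 * 1024 * 1024 / 2048 = 1024. If max_load - avg_load is 512, max_pull is limited to 512.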
>
> > + /*
> > + * We're trying to get all the cpus to the average_load, so we don't
> > + * want to push ourselves above the average load, nor do we wish to
> > + * reduce the max loaded cpu below the average load, as either of these
> > + * actions would just result in more rebalancing later, and ping-pong
> > + * tasks around. Thus we look for the minimum possible imbalance.
> > + * Negative imbalances (*we* are more loaded than anyone else) will
> > + * be counted as no imbalance for these purposes -- we can't fix that
> > + * by pulling tasks to us. Be careful of negative numbers as they'll
> > + * appear as very large values with unsigned longs.
> > + */
> > + max_pull = min(sds->max_load - sds->avg_load, load_above_capacity);
>
> Does this increase or decrease the value of max_pull from previous
> expression?
Does the above help answer your question, Vaidy?
>
> > /* How much load to actually move to equalise the imbalance */
> > *imbalance = min(max_pull * sds->busiest->cpu_power,
> > @@ -4069,19 +4097,6 @@ find_busiest_group(struct sched_domain *sd, int this_cpu,
> > sds.busiest_load_per_task =
> > min(sds.busiest_load_per_task, sds.avg_load);
> >
> > - /*
> > - * We're trying to get all the cpus to the average_load, so we don't
> > - * want to push ourselves above the average load, nor do we wish to
> > - * reduce the max loaded cpu below the average load, as either of these
> > - * actions would just result in more rebalancing later, and ping-pong
> > - * tasks around. Thus we look for the minimum possible imbalance.
> > - * Negative imbalances (*we* are more loaded than anyone else) will
> > - * be counted as no imbalance for these purposes -- we can't fix that
> > - * by pulling tasks to us. Be careful of negative numbers as they'll
> > - * appear as very large values with unsigned longs.
> > - */
> > - if (sds.max_load <= sds.busiest_load_per_task)
> > - goto out_balanced;
>
> This is right. This condition was treating most cases as balanced and
> exit right here. However if this check is removed, we will have to
> execute more code to detect/ascertain balanced case.
To add, in update_sd_lb_stats() we are already doing this:

	} else if (sgs.avg_load > sds->max_load &&
		   (sgs.sum_nr_running > sgs.group_capacity ||
			sgs.group_imb)) {

So we already check sum_nr_running > group_capacity when selecting the
busiest group. In other words, we do the equivalent of this balanced check
much earlier.
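As a hedged illustration of that selection condition, the check can be isolated into a small predicate. This is a userspace sketch only: the struct and function names are hypothetical, and only the boolean expression mirrors the snippet from update_sd_lb_stats().

```c
/* Hypothetical subset of the per-group statistics used by the check. */
struct sg_stats {
	unsigned long avg_load;		/* average load of this group */
	unsigned long sum_nr_running;	/* tasks running in this group */
	unsigned long group_capacity;	/* tasks the group can handle */
	int group_imb;			/* group is internally imbalanced */
};

/*
 * A group can become the busiest candidate only if it is more loaded than
 * the current maximum AND it either runs more tasks than its capacity or
 * is internally imbalanced.
 */
int is_busiest_candidate(const struct sg_stats *sgs,
			 unsigned long cur_max_load)
{
	return sgs->avg_load > cur_max_load &&
	       (sgs->sum_nr_running > sgs->group_capacity ||
		sgs->group_imb);
}
```

A group that is heavily loaded but still within capacity (and not imbalanced) is never picked as busiest, which is why a later "balanced" test on the busiest group would be redundant.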
thanks,
suresh