Message-ID: <1266341325.9432.283.camel@laptop>
Date: Tue, 16 Feb 2010 18:28:44 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: svaidy@...ux.vnet.ibm.com
Cc: Suresh Siddha <suresh.b.siddha@...el.com>,
Ingo Molnar <mingo@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
"Ma, Ling" <ling.ma@...el.com>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>, ego@...ibm.com
Subject: Re: [patch] sched: fix SMT scheduler regression in find_busiest_queue()

On Tue, 2010-02-16 at 21:29 +0530, Vaidyanathan Srinivasan wrote:
> Agreed. Placement control should be handled by SD_PREFER_SIBLING
> and SD_POWER_SAVINGS flags.
>
> --Vaidy
>
> ---
>
> sched_smt_powersavings for threaded systems need this fix for
> consolidation to sibling threads to work. Since threads have
> fractional capacity, group_capacity will turn out to be one
> always and not accommodate another task in the sibling thread.
>
> This fix makes group_capacity a function of cpumask_weight that
> will enable the power saving load balancer to pack tasks among
> sibling threads and keep more cores idle.
>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 522cf0e..ec3a5c5 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -2538,9 +2538,17 @@ static inline void update_sd_lb_stats(struct sched_domain *sd, int this_cpu,
> * In case the child domain prefers tasks go to siblings
> * first, lower the group capacity to one so that we'll try
> * and move all the excess tasks away.

I prefer a blank line between two paragraphs, but even better would be
to place this comment at the else-if site.

> + * If power savings balance is set at this domain, then
> + * make capacity equal to number of hardware threads to
> + * accomodate more tasks until capacity is reached. The

My spell checker seems to prefer: accommodate

> + * default is fractional capacity for sibling hardware
> + * threads for fair use of available hardware resources.
> */
> if (prefer_sibling)
> sgs.group_capacity = min(sgs.group_capacity, 1UL);
> + else if (sd->flags & SD_POWERSAVINGS_BALANCE)
> + sgs.group_capacity =
> + cpumask_weight(sched_group_cpus(group));

I guess we should apply cpu_active_mask so that we properly deal with
offline siblings. But with cpumasks being the beasts they are, I see no
cheap way to do that.

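
To make the offline-sibling concern concrete, here is a minimal userspace sketch, not kernel code. It models a cpumask as a single 64-bit word (an assumption; real cpumasks can span many words, which is why the intersection is not cheap there) and counts only the group members that are also active. The names toy_cpumask_t, toy_cpumask_weight() and toy_group_capacity() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for struct cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t toy_cpumask_t;

/* Count set bits; plays the role of cpumask_weight() in this model. */
static unsigned long toy_cpumask_weight(toy_cpumask_t mask)
{
	unsigned long weight = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		weight++;
	}
	return weight;
}

/*
 * Capacity selection shaped like the quoted hunk, except that the
 * power-savings branch counts only group CPUs that are also active.
 */
static unsigned long toy_group_capacity(unsigned long capacity,
					int prefer_sibling,
					int powersavings_balance,
					toy_cpumask_t group_cpus,
					toy_cpumask_t active_cpus)
{
	assert(group_cpus != 0);	/* a sched group is never empty */

	if (prefer_sibling)
		return capacity < 1UL ? capacity : 1UL;
	if (powersavings_balance)
		return toy_cpumask_weight(group_cpus & active_cpus);
	return capacity;
}
```

In this model, a two-thread core (group mask 0x3) with both siblings active gets capacity 2 under power-savings balance, while the same core with one sibling offline (active mask 0x1) gets capacity 1 instead of being overcounted.
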
> if (local_group) {
> sds->this_load = sgs.avg_load;
> @@ -2855,7 +2863,8 @@ static int need_active_balance(struct sched_domain *sd, int sd_idle, int idle)
> !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
> return 0;
>
> - if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
> + if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP &&
> + sched_smt_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
> return 0;
> }
/me still hopes for that unification patch.. :-)
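
For reference, the combined check in the need_active_balance() hunk can be sketched in isolation as below. This is a userspace illustration only: the enum is assumed to mirror the kernel's powersavings balance levels (NONE, BASIC, WAKEUP), and toy_skip_active_balance() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Assumed to mirror the kernel's power-savings balance levels. */
enum toy_powersavings_level {
	TOY_POWERSAVINGS_BALANCE_NONE = 0,
	TOY_POWERSAVINGS_BALANCE_BASIC,
	TOY_POWERSAVINGS_BALANCE_WAKEUP,
	TOY_MAX_POWERSAVINGS_BALANCE_LEVELS
};

/*
 * Returns 1 when the patched condition would bail out (the "return 0"
 * in the hunk): only when BOTH the MC and the SMT knob are below WAKEUP.
 */
static int toy_skip_active_balance(int mc_level, int smt_level)
{
	assert(mc_level >= 0 && mc_level < TOY_MAX_POWERSAVINGS_BALANCE_LEVELS);
	assert(smt_level >= 0 && smt_level < TOY_MAX_POWERSAVINGS_BALANCE_LEVELS);

	return mc_level < TOY_POWERSAVINGS_BALANCE_WAKEUP &&
	       smt_level < TOY_POWERSAVINGS_BALANCE_WAKEUP;
}
```

With the unpatched single-knob test, setting only the SMT knob to WAKEUP would still bail out; with the combined test, either knob at WAKEUP is enough to let the check proceed.
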