Message-ID: <20090320043437.GA6603@in.ibm.com>
Date: Fri, 20 Mar 2009 10:04:37 +0530
From: Gautham R Shenoy <ego@...ibm.com>
To: Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
Suresh Siddha <suresh.b.siddha@...el.com>,
Balbir Singh <balbir@...ibm.com>
Subject: Re: [PATCH 3 2/6] sched: Record the current active power savings level

On Thu, Mar 19, 2009 at 10:02:05PM +0530, Vaidyanathan Srinivasan wrote:
> * Gautham R Shenoy <ego@...ibm.com> [2009-03-18 14:52:28]:
>
> > The current active power savings level of a system is defined as the
> > maximum of the sched_mc_power_savings and the sched_smt_power_savings.
> >
> > The decisions during power-aware loadbalancing, depend on this value.
> >
> > Record this value in a read mostly global variable instead of having to
> > compute it everytime.
> >
> > Signed-off-by: Gautham R Shenoy <ego@...ibm.com>
> > Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> > ---
> >
> > include/linux/sched.h | 1 +
> > kernel/sched.c | 8 ++++++--
> > kernel/sched_fair.c | 2 +-
> > 3 files changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 37fecf7..7dc8aea 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -793,6 +793,7 @@ enum powersavings_balance_level {
> > };
> >
> > extern int sched_mc_power_savings, sched_smt_power_savings;
>
> Now we will need sched_mc_power_savings and sched_smt_power_savings
> only until we rebuild the sched domains. These can be static
> variables and we can perhaps remove the extern for them? Better still
> if we can capture this information elsewhere until sched domain is
> built and SD_POWERSAVINGS_BALANCE flags are set so as to not
> have a need for these global variables.

As you rightly pointed out, the only place outside sched.c that needs
these variables right now is the rebuild of the sched_domains. So yes,
they should be read-only outside sched.c, but they are still required
there.

However, as we discussed in one of the earlier posts, once we have a
single tunable that captures in essence what these two variables seek
to achieve, we can get rid of them. Until then I think we'll have to
retain them.

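
To make the "read-only outside sched.c" idea concrete, here is a minimal
userspace sketch, not the patch itself. It assumes the two tunables become
file-local and are exposed through accessors; the accessor names
(get_sched_mc_power_savings() and friends) and power_savings_store() are
hypothetical, and the enum members other than POWERSAVINGS_BALANCE_WAKEUP
are my assumption of how the levels are laid out.

```c
/*
 * Userspace sketch only: names mirror the patch, but this is not
 * kernel code and the accessors below are hypothetical.
 */
enum powersavings_balance_level {
	POWERSAVINGS_BALANCE_NONE = 0,	/* assumed: no power-aware balancing */
	POWERSAVINGS_BALANCE_BASIC,	/* assumed: basic consolidation */
	POWERSAVINGS_BALANCE_WAKEUP,	/* also bias wakeups */
	MAX_POWERSAVINGS_BALANCE_LEVELS
};

/* File-local, so no other file can write to them directly. */
static int sched_mc_power_savings;
static int sched_smt_power_savings;

/* Cached maximum of the two tunables; updated only on a store. */
static enum powersavings_balance_level active_power_savings_level;

/* Read-only views for code outside this file (hypothetical). */
int get_sched_mc_power_savings(void)
{
	return sched_mc_power_savings;
}

int get_sched_smt_power_savings(void)
{
	return sched_smt_power_savings;
}

enum powersavings_balance_level get_active_power_savings_level(void)
{
	return active_power_savings_level;
}

/*
 * Store a new level for one tunable and refresh the cached maximum.
 * Returns 0 on success, -1 if the level is out of range.
 */
int power_savings_store(int level, int smt)
{
	int active;

	if (level < 0 || level >= MAX_POWERSAVINGS_BALANCE_LEVELS)
		return -1;

	if (smt)
		sched_smt_power_savings = level;
	else
		sched_mc_power_savings = level;

	active = sched_smt_power_savings > sched_mc_power_savings ?
		 sched_smt_power_savings : sched_mc_power_savings;
	active_power_savings_level = (enum powersavings_balance_level)active;

	return 0;
}
```

With this shape the load balancer only ever reads the cached level, and
the domain-rebuild code reads the individual tunables through the
accessors, so the writes stay confined to the store path.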
>
> > +extern enum powersavings_balance_level active_power_savings_level;
> > enum sched_domain_level {
> > SD_LV_NONE = 0,
> > diff --git a/kernel/sched.c b/kernel/sched.c
> > index 8e2558c..407ee03 100644
> > --- a/kernel/sched.c
> > +++ b/kernel/sched.c
> > @@ -3398,7 +3398,7 @@ out_balanced:
> >
> > if (this == group_leader && group_leader != group_min) {
> > *imbalance = min_load_per_task;
> > - if (sched_mc_power_savings >= POWERSAVINGS_BALANCE_WAKEUP) {
> > + if (active_power_savings_level >= POWERSAVINGS_BALANCE_WAKEUP) {
> > cpu_rq(this_cpu)->rd->sched_mc_preferred_wakeup_cpu =
> > cpumask_first(sched_group_cpus(group_leader));
> > }
> > @@ -3683,7 +3683,7 @@ redo:
> > !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
> > return -1;
> >
> > - if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
> > + if (active_power_savings_level < POWERSAVINGS_BALANCE_WAKEUP)
> > return -1;
> >
> > if (sd->nr_balance_failed++ < 2)
> > @@ -7206,6 +7206,8 @@ static void sched_domain_node_span(int node, struct cpumask *span)
> > #endif /* CONFIG_NUMA */
> >
> > int sched_smt_power_savings = 0, sched_mc_power_savings = 0;
> > +/* Records the currently active power savings level */
> > +enum powersavings_balance_level __read_mostly active_power_savings_level;
> >
> > /*
> > * The cpus mask in sched_group and sched_domain hangs off the end.
> > @@ -8040,6 +8042,8 @@ static ssize_t sched_power_savings_store(const char *buf, size_t count, int smt)
> > sched_smt_power_savings = level;
> > else
> > sched_mc_power_savings = level;
> > + active_power_savings_level = max(sched_smt_power_savings,
> > + sched_mc_power_savings);
> >
> > arch_reinit_sched_domains();
> >
> > diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> > index 0566f2a..a3583c6 100644
> > --- a/kernel/sched_fair.c
> > +++ b/kernel/sched_fair.c
> > @@ -1054,7 +1054,7 @@ static int wake_idle(int cpu, struct task_struct *p)
> > chosen_wakeup_cpu =
> > cpu_rq(this_cpu)->rd->sched_mc_preferred_wakeup_cpu;
> >
> > - if (sched_mc_power_savings >= POWERSAVINGS_BALANCE_WAKEUP &&
> > + if (active_power_savings_level >= POWERSAVINGS_BALANCE_WAKEUP &&
> > idle_cpu(cpu) && idle_cpu(this_cpu) &&
> > p->mm && !(p->flags & PF_KTHREAD) &&
> > cpu_isset(chosen_wakeup_cpu, p->cpus_allowed))
> >
>
>
> Acked-by: Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>
--
Thanks and Regards
gautham