Message-ID: <20080626210025.GB26167@in.ibm.com>
Date: Fri, 27 Jun 2008 02:30:25 +0530
From: Dipankar Sarma <dipankar@...ibm.com>
To: Andi Kleen <andi@...stfloor.org>
Cc: balbir@...ux.vnet.ibm.com,
Linux Kernel <linux-kernel@...r.kernel.org>,
Suresh B Siddha <suresh.b.siddha@...el.com>,
Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>,
Ingo Molnar <mingo@...e.hu>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Vatsa <vatsa@...ux.vnet.ibm.com>,
Gautham R Shenoy <ego@...ibm.com>
Subject: Re: [RFC v1] Tunable sched_mc_power_savings=n
On Thu, Jun 26, 2008 at 10:17:00PM +0200, Andi Kleen wrote:
> Vaidyanathan Srinivasan wrote:
> > System management software and workload monitoring and managing
> > software can potentially control the tunable on behalf of the
> > applications for best overall power savings and performance.
>
> Does it have the needed information for that? e.g. real time information
> on what the system does? I don't think anybody is in a better position
> to control that than the kernel.
Some workload managers already do that - they provision cpu and memory
resources based on request rates and response times. Such software is
in a better position to decide whether it can live with
reduced performance due to power saving mode or not. The point I am
making is that the kernel doesn't have any notion of transactional
performance - so if an administrator wants to run unimportant
transactions on a slower but low-power system, he/she should have
the option of doing so.
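To make the argument concrete, here is a rough sketch of the kind of decision
a workload manager could make from data the kernel never sees. Everything in
it is hypothetical and for illustration only: the function name, the
response-time inputs, the "important" flag and the 0.8 headroom factor are
assumptions, not part of any existing tool or of the proposed patch.

```python
def choose_power_savings(response_ms, target_ms, important, headroom=0.8):
    """Pick a sched_mc_power_savings level from transaction-level data.

    All inputs are hypothetical values a workload manager might track:
    the measured response time, the target it must meet, and whether the
    transactions are important to the administrator.
    """
    if important:
        # Never trade response time for power on important work.
        return 0
    if response_ms <= headroom * target_ms:
        # Comfortable slack against the target: allow power-saving packing.
        return 1
    # Close to or over the target: go back to full performance.
    return 0
```

For unimportant transactions that finish in 50 ms against a 100 ms target the
sketch picks level 1; at 95 ms it falls back to level 0.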
> > Applications with conflicting goals should resolve among themselves.
>
> That sounds wrong to me. Negotiating between conflicting requirements
> from different applications is something that kernels are supposed
> to do.
Agreed. However that is a difficult problem to solve and not the
intention of this idea. Global power setting is a simple first step.
I don't think we have a good understanding of cases where conflicting
power requirements from multiple applications need to be addressed.
We will have to look at that when the issue arises.
> > In a small-scale datacenters, peak and off-peak hour settings can be
> > potentially done through simple cron jobs.
>
> Is there any real drawback from only controlling it through nice levels?
In a system with more than a couple of sockets, it is more beneficial
(power-wise) to pack all work into a small number of processors
and let the other processors go into a very low-power sleep state than
to run tasks slowly and spread them over all the processors.
> Anyways I think the main thing I object to in your proposal is that
> your tunable is system global, not per process. I'm also not
> sure if a tunable is really a good idea and if the kernel couldn't
> do a better job.
While it would be nice to have a per process tunable, I am not sure
we are ready for that yet. A global setting is easy to implement
and we have immediate use for it. The kernel already does a decent
job conservatively - by packing one task per core in a package
when sched_mc_power_savings=1 is set. Any further packing may affect
performance and should not therefore be the default behavior.
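As an illustration of how simple the global setting is to drive from
userspace (for example from the cron job mentioned above), here is a minimal
sketch. The sysfs path is an assumption about where the tunable is exposed on
kernels that have it, so please verify it on the target system; the 08:00 to
20:00 peak window and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the tunable; the file is absent on kernels
# that do not expose it.
TUNABLE = Path("/sys/devices/system/cpu/sched_mc_power_savings")

# Hypothetical peak window, in local hours [start, end).
PEAK_START, PEAK_END = 8, 20


def level_for_hour(hour, peak_level=0, off_peak_level=1):
    """Return the power-savings level to use at the given hour (0-23)."""
    if PEAK_START <= hour < PEAK_END:
        return peak_level
    return off_peak_level


def apply_level(level, path=TUNABLE):
    """Write the level to the tunable file; needs root on a real system."""
    Path(path).write_text(f"{level}\n")
```

An hourly cron job could then call
apply_level(level_for_hour(datetime.datetime.now().hour)) as root.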
Thanks
Dipankar