Message-ID: <20100503204818.7b801f43@infradead.org>
Date: Mon, 3 May 2010 20:48:18 -0700
From: Arjan van de Ven <arjan@...radead.org>
To: Thomas Renninger <trenn@...e.de>
Cc: Willy Tarreau <w@....eu>, Pavel Machek <pavel@....cz>,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
mingo@...e.hu, peterz@...radead.org, tglx@...utronix.de,
davej@...hat.com, cpufreq@...r.kernel.org, riel@...hat.com
Subject: [PATCH 8/7] cpufreq: make the iowait-is-busy-time a sysfs tunable
On Tue, 27 Apr 2010 13:39:34 +0200
Thomas Renninger <trenn@...e.de> wrote:
> On Friday 23 April 2010 06:08:19 pm Arjan van de Ven wrote:
> > On Fri, 23 Apr 2010 10:50:10 +0200
> > Thomas Renninger <trenn@...e.de> wrote:
> >
> > > Especially on battery, users will appreciate some minutes
> > > of more battery lifetime and do not care about some ms of IO
> > > latencies.
> >
> > the assumption that power doesn't matter on AC is a huge fiction
> > that any data center operator would love to get out of everyone's
> > head as quickly as possible.
>
> Have I said power doesn't matter on AC?
> Do you agree that a datacenter has different performance vs power
> tradeoff demands as a battery driven mobile device?
>
> Back to the topic:
> As you did not answer on my (several) sysfs knob request(s), I expect
> you agree with it and will add one.
>
Yup, it makes sense to have a sysfs knob with a sane default value.

From: Arjan van de Ven <arjan@...ux.intel.com>
Subject: [PATCH] cpufreq: make the iowait-is-busy-time a sysfs tunable

Pavel Machek pointed out that not all CPUs idle efficiently at high
frequency. Specifically, older Intel and various AMD CPUs would use
more power when copying files from USB. Mike Chan pointed out that
the same is true for various ARM chips.

Thomas Renninger suggested making this a sysfs tunable with a
reasonable default.

This patch adds a sysfs tunable for the new behavior and uses a very
simple function to pick a reasonable default based on the CPU
vendor/type.

Signed-off-by: Arjan van de Ven <arjan@...ux.intel.com>
---
drivers/cpufreq/cpufreq_ondemand.c | 46 +++++++++++++++++++++++++++++++++++-
1 files changed, 45 insertions(+), 1 deletions(-)
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index ed472f8..4877e8f 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -109,6 +109,7 @@ static struct dbs_tuners {
unsigned int down_differential;
unsigned int ignore_nice;
unsigned int powersave_bias;
+ unsigned int io_is_busy;
} dbs_tuners_ins = {
.up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
.down_differential = DEF_FREQUENCY_DOWN_DIFFERENTIAL,
@@ -260,6 +261,7 @@ static ssize_t show_##file_name \
return sprintf(buf, "%u\n", dbs_tuners_ins.object); \
}
show_one(sampling_rate, sampling_rate);
+show_one(io_is_busy, io_is_busy);
show_one(up_threshold, up_threshold);
show_one(ignore_nice_load, ignore_nice);
show_one(powersave_bias, powersave_bias);
@@ -310,6 +312,22 @@ static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
return count;
}
+static ssize_t store_io_is_busy(struct kobject *a, struct attribute *b,
+ const char *buf, size_t count)
+{
+ unsigned int input;
+ int ret;
+ ret = sscanf(buf, "%u", &input);
+ if (ret != 1)
+ return -EINVAL;
+
+ mutex_lock(&dbs_mutex);
+ dbs_tuners_ins.io_is_busy = !!input;
+ mutex_unlock(&dbs_mutex);
+
+ return count;
+}
+
static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
const char *buf, size_t count)
{
@@ -392,6 +410,7 @@ static struct global_attr _name = \
__ATTR(_name, 0644, show_##_name, store_##_name)
define_one_rw(sampling_rate);
+define_one_rw(io_is_busy);
define_one_rw(up_threshold);
define_one_rw(ignore_nice_load);
define_one_rw(powersave_bias);
@@ -403,6 +422,7 @@ static struct attribute *dbs_attributes[] = {
&up_threshold.attr,
&ignore_nice_load.attr,
&powersave_bias.attr,
+ &io_is_busy.attr,
NULL
};
@@ -527,7 +547,7 @@ static void dbs_check_cpu(struct cpu_dbs_info_s *this_dbs_info)
* from the cpu idle time.
*/
- if (idle_time >= iowait_time)
+ if (dbs_tuners_ins.io_is_busy && idle_time >= iowait_time)
idle_time -= iowait_time;
if (unlikely(!wall_time || wall_time < idle_time))
@@ -643,6 +663,29 @@ static inline void dbs_timer_exit(struct cpu_dbs_info_s *dbs_info)
cancel_delayed_work_sync(&dbs_info->work);
}
+/*
+ * Not all CPUs want IO time to be accounted as busy; this depends on how
+ * efficient idling at a higher frequency/voltage is.
+ * Pavel Machek says this is not so for various generations of AMD and old
+ * Intel systems.
+ * Mike Chan (android.com) says this is also not true for ARM.
+ * Because of this, whitelist specific known (series) of CPUs by default, and
+ * leave all others up to the user.
+ */
+static int should_io_be_busy(void)
+{
+#if defined(CONFIG_X86)
+ /*
+ * For Intel, Core 2 (model 15) and later have an efficient idle.
+ */
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+ boot_cpu_data.x86 == 6 &&
+ boot_cpu_data.x86_model >= 15)
+ return 1;
+#endif
+ return 0;
+}
+
static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
unsigned int event)
{
@@ -705,6 +748,7 @@ static int cpufreq_governor_dbs(struct cpufreq_policy *policy,
dbs_tuners_ins.sampling_rate =
max(min_sampling_rate,
latency * LATENCY_MULTIPLIER);
+ dbs_tuners_ins.io_is_busy = should_io_be_busy();
}
mutex_unlock(&dbs_mutex);
--
1.6.1.3
--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org