Message-ID: <1291071677.32004.527.camel@laptop>
Date: Tue, 30 Nov 2010 00:01:17 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: tmhikaru@...il.com
Cc: Damien Wyart <damien.wyart@...e.fr>,
Venkatesh Pallipadi <venki@...gle.com>,
Chase Douglas <chase.douglas@...onical.com>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, Kyle McMartin <kyle@...artin.ca>
Subject: Re: High CPU load when machine is idle (related to PROBLEM:
Unusually high load average when idle in 2.6.35, 2.6.35.1 and later)
On Mon, 2010-11-29 at 14:40 -0500, tmhikaru@...il.com wrote:
> On Mon, Nov 29, 2010 at 12:38:46PM +0100, Peter Zijlstra wrote:
> > On Sun, 2010-11-28 at 12:40 +0100, Damien Wyart wrote:
> > > Hi,
> > >
> > > * Peter Zijlstra <peterz@...radead.org> [2010-11-27 21:15]:
> > > > How does this work for you? It's hideous, but let's start simple.
> > > > [...]
> > >
> > > Doesn't give wrong numbers like initial bug and tentative patches, but
> > > feels a bit too slow when numbers go up and down. Correct values are
> > > reached when waiting long enough, but it feels slow.
> > >
> > > As I've tested many combinations, maybe this is an impression because
> > > I do not remember about "normal" delays for the load to rise and fall,
> > > but this still feels slow.
> >
> > You can test this by either booting with nohz=off, or building with
> > CONFIG_NO_HZ=n and then comparing the result, something like
> >
> > make O=defconfig clean; while sleep 10; do uptime >> load.log; done &
> > make -j32 O=defconfig; kill %1
> >
> > And comparing the curves between the NO_HZ and !NO_HZ kernels.
> >
> > I'll try and make the patch less hideous ;-)
>
> I've tested this patch on my own use case, and it seems to work for the most
> part - it's still not settling as low as the previous implementation used
> to, nor is it settling as low as CONFIG_NO_HZ=N (that is to say, 0.00 across
> the board when not being used) however, this is definitely an improvement:
>
> 14:26:04 up 9:08, 5 users, load average: 0.05, 0.01, 0.00
>
> This is the result of running uptime on a checked out version of
> [74f5187ac873042f502227701ed1727e7c5fbfa9] sched: Cure load average vs NO_HZ woes
>
> with the patch applied, starting X, and simply letting the machine sit idle
> for nine hours. For the brief period I spent watching it after boot, it
> quickly began settling down to a reasonable value; I only let it sit idle
> this long to verify the loadavg was consistently low. (the loadavg was
> consistently erratic, anywhere from 0.6 to 1.2 with the machine idle without
> this patch)

OK, that's good testing. So it's still not quite the same as NO_HZ=n;
how about this one?

(It seems to drop down to 0.00 if I wait a few minutes with top -d5.)

---
 kernel/sched.c           |    5 +++++
 kernel/time/tick-sched.c |    4 +++-
 kernel/timer.c           |   12 ++++++++++++
 3 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 864040c..a859158 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -3082,6 +3082,11 @@ static void calc_load_account_active(struct rq *this_rq)
 	this_rq->calc_load_update += LOAD_FREQ;
 }
 
+void calc_load_account_this(void)
+{
+	calc_load_account_active(this_rq());
+}
+
 /*
  * The exact cpuload at various idx values, calculated at every tick would be
  * load = (2^idx - 1) / 2^idx * load + 1 / 2^idx * cur_load
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 3e216e0..1e6d384 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -41,6 +41,8 @@ struct tick_sched *tick_get_tick_sched(int cpu)
 	return &per_cpu(tick_cpu_sched, cpu);
 }
 
+extern void do_timer_nohz(unsigned long ticks);
+
 /*
  * Must be called with interrupts disabled !
  */
@@ -75,7 +77,7 @@ static void tick_do_update_jiffies64(ktime_t now)
 			last_jiffies_update = ktime_add_ns(last_jiffies_update,
 							   incr * ticks);
 		}
-		do_timer(++ticks);
+		do_timer_nohz(++ticks);
 
 		/* Keep the tick_next_period variable up to date */
 		tick_next_period = ktime_add(last_jiffies_update, tick_period);
diff --git a/kernel/timer.c b/kernel/timer.c
index d6ccb90..eb2646f 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1300,6 +1300,18 @@ void do_timer(unsigned long ticks)
 	calc_global_load();
 }
 
+extern void calc_load_account_this(void);
+
+void do_timer_nohz(unsigned long ticks)
+{
+	while (ticks--) {
+		jiffies_64++;
+		calc_load_account_this();
+		calc_global_load();
+	}
+	update_wall_time();
+}
+
 #ifdef __ARCH_WANT_SYS_ALARM
 
 /*
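
For anyone following along, the effect of stepping jiffies one tick at a
time can be sketched in userspace. This is an illustration only, not kernel
code. calc_load() and the FSHIFT, FIXED_1, EXP_1 and LOAD_FREQ constants
mirror the kernel's fixed-point load-average arithmetic as I understand it;
HZ=1000, struct load_state, advance_batched() and advance_per_tick() are
hypothetical names made up for this example. advance_batched() assumes a
do_timer()-style update that takes at most one load sample per call, and
advance_per_tick() models the do_timer_nohz() loop in the patch above.

```c
#include <assert.h>

#define FSHIFT    11                /* bits of fractional precision */
#define FIXED_1   (1UL << FSHIFT)   /* 1.0 in fixed point */
#define EXP_1     1884UL            /* 1/exp(5sec/1min) in fixed point */
#define HZ        1000UL            /* assumed tick rate for this sketch */
#define LOAD_FREQ (5 * HZ + 1)      /* ticks between two load samples */

/* One exponential-decay step: load = load*exp + active*(1 - exp). */
static unsigned long calc_load(unsigned long load, unsigned long exp,
			       unsigned long active)
{
	assert(exp <= FIXED_1);
	load *= exp;
	load += active * (FIXED_1 - exp);
	return load >> FSHIFT;
}

/* Hypothetical stand-in for jiffies, calc_load_update and avenrun[0]. */
struct load_state {
	unsigned long jiffies;
	unsigned long next_update;
	unsigned long avenrun;
};

/* Jump forward by 'ticks' at once: at most one sample is folded in. */
static void advance_batched(struct load_state *s, unsigned long ticks,
			    unsigned long active)
{
	s->jiffies += ticks;
	if (s->jiffies >= s->next_update) {
		s->avenrun = calc_load(s->avenrun, EXP_1, active);
		s->next_update += LOAD_FREQ;
	}
}

/* Step one tick at a time: every missed sample window is folded in. */
static void advance_per_tick(struct load_state *s, unsigned long ticks,
			     unsigned long active)
{
	while (ticks--) {
		s->jiffies++;
		if (s->jiffies >= s->next_update) {
			s->avenrun = calc_load(s->avenrun, EXP_1, active);
			s->next_update += LOAD_FREQ;
		}
	}
}
```

Starting both variants from a load of 1.00 (avenrun = FIXED_1, next_update =
LOAD_FREQ) and advancing 12 * LOAD_FREQ ticks with active = 0, the batched
variant decays the average once, while the per-tick variant decays it twelve
times, which is what a ticking (NO_HZ=n) kernel would have done over the
same idle minute.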