Message-ID: <tip-b417c9fd8690637f0c91479435ab3e2bf450c038@git.kernel.org>
Date: Tue, 22 Sep 2009 13:34:57 GMT
From: tip-bot for Ingo Molnar <mingo@...e.hu>
To: linux-tip-commits@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, hpa@...or.com, mingo@...hat.com,
seto.hidetoshi@...fujitsu.com, ying.huang@...el.com,
ak@...ux.intel.com, tglx@...utronix.de, mingo@...e.hu
Subject: [tip:x86/urgent] x86: mce: Fix thermal throttling message storm
Commit-ID: b417c9fd8690637f0c91479435ab3e2bf450c038
Gitweb: http://git.kernel.org/tip/b417c9fd8690637f0c91479435ab3e2bf450c038
Author: Ingo Molnar <mingo@...e.hu>
AuthorDate: Tue, 22 Sep 2009 15:50:24 +0200
Committer: Ingo Molnar <mingo@...e.hu>
CommitDate: Tue, 22 Sep 2009 17:30:45 +0200
x86: mce: Fix thermal throttling message storm

If a system switches back and forth between hot and cold mode,
the MCE code will print a stream of critical kernel messages.

Extend the throttling code to properly notice this, by
only printing the first hot + cold transition and omitting
the rest up to CHECK_INTERVAL (5 minutes).

This way we'll only get a single incident of:

 [ 102.356584] CPU0: Temperature above threshold, cpu clock throttled (total events = 1)
 [ 102.357000] Disabling lock debugging due to kernel taint
 [ 102.369223] CPU0: Temperature/speed normal

every 5 minutes. The 'total events' count gives the number of cold/hot
transitions detected so far, so if overheating occurs again after
5 minutes the log looks like this:

 [ 402.357580] CPU0: Temperature above threshold, cpu clock throttled (total events = 24891)
 [ 402.358001] CPU0: Temperature/speed normal
 [ 450.704142] Machine check events logged

Cc: Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>
Cc: Huang Ying <ying.huang@...el.com>
Cc: Andi Kleen <ak@...ux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@...e.hu>
---
arch/x86/kernel/cpu/mcheck/therm_throt.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
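
To see why comparing throttle_count against last_throttle_count lets
exactly one hot + cold pair through per interval, the patched check can
be modelled in userspace. This is only a sketch under assumptions, not
the kernel code: time is a plain seconds counter instead of jiffies, the
state is passed in explicitly instead of being per-CPU, the is_throttled
field is assumed from the surrounding function (it is not visible in the
diff), and run_storm() is a hypothetical helper added for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: seconds instead of jiffies; 5 minutes as in the commit text. */
#define CHECK_INTERVAL 300

struct thermal_state {
    bool is_throttled;                  /* assumed field, not shown in the diff */
    uint64_t next_check;
    unsigned long throttle_count;
    unsigned long last_throttle_count;
};

/* Returns 1 if a message is printed, 0 if it is suppressed or nothing changed. */
static int therm_throt_process(struct thermal_state *state,
                               bool is_throttled, uint64_t now)
{
    bool was_throttled = state->is_throttled;

    state->is_throttled = is_throttled;
    if (is_throttled)
        state->throttle_count++;

    /*
     * Inside the interval, only an event whose count still equals the
     * last reported count passes. That is the cold event directly
     * following the reported hot event; every later event is dropped.
     */
    if (now < state->next_check &&
        state->throttle_count != state->last_throttle_count)
        return 0;

    state->next_check = now + CHECK_INTERVAL;
    state->last_throttle_count = state->throttle_count;

    if (is_throttled) {
        printf("Temperature above threshold (total events = %lu)\n",
               state->throttle_count);
        return 1;
    }
    if (was_throttled) {
        printf("Temperature/speed normal\n");
        return 1;
    }
    return 0;
}

/* Hypothetical helper: feed `cycles` hot/cold pairs at time `start`. */
static int run_storm(struct thermal_state *state, uint64_t start, int cycles)
{
    int printed = 0;

    for (int i = 0; i < cycles; i++) {
        printed += therm_throt_process(state, true, start);
        printed += therm_throt_process(state, false, start);
    }
    return printed;
}
```

With a zero-initialised state, run_storm(&s, 102, 1000) prints one hot
and one normal message and suppresses the remaining 1998 events, while
throttle_count still advances to 1000. The next report only becomes
possible once `now` reaches next_check again.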
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index db80b57..b3a1dba 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -42,6 +42,7 @@ struct thermal_state {
u64 next_check;
unsigned long throttle_count;
+ unsigned long last_throttle_count;
};
static DEFINE_PER_CPU(struct thermal_state, thermal_state);
@@ -120,11 +121,12 @@ static int therm_throt_process(bool is_throttled)
if (is_throttled)
state->throttle_count++;
- if (!(was_throttled ^ is_throttled) &&
- time_before64(now, state->next_check))
+ if (time_before64(now, state->next_check) &&
+ state->throttle_count != state->last_throttle_count)
return 0;
state->next_check = now + CHECK_INTERVAL;
+ state->last_throttle_count = state->throttle_count;
/* if we just entered the thermal event */
if (is_throttled) {