Message-ID: <4D8BAB49.3080701@openvz.org>
Date: Thu, 24 Mar 2011 23:36:25 +0300
From: Cyrill Gorcunov <gorcunov@...nvz.org>
To: Ingo Molnar <mingo@...e.hu>
CC: Don Zickus <dzickus@...hat.com>, Lin Ming <ming.m.lin@...el.com>,
lkml <linux-kernel@...r.kernel.org>
Subject: [PATCH] perf, x86: P4 PMU - Read proper MSR register to catch unflagged
overflows
From: Don Zickus <dzickus@...hat.com>
Subject: [PATCH -tip] perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
The read of the proper MSR register was missing: instead of the counter, the
configuration register was tested (it always has ARCH_P4_UNFLAGGED_BIT
cleared), which led to unknown NMIs hitting the system. As a result the user
may see the "Dazed and confused, but trying to continue" message. Fix it by
reading the proper MSR register.
When an NMI happens on a P4, the perf NMI handler checks the configuration
register to see whether the overflow bit is set before taking
appropriate action. Unfortunately, various P4 machines have a broken
overflow bit, so a backup mechanism was implemented. This mechanism
checks whether the counter has rolled over.
The previous commit that implemented this backup mechanism was broken.
Instead of reading the counter register, it used the value of the
configuration register to determine whether the counter had rolled over.
Testing the bit in that value gives incorrect results.
This leads to 'Dazed and confused' messages for the end user when
using the perf tool (or if the NMI watchdog is running).
The fix is to read the counter register before determining whether the
counter has rolled over.
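For illustration only, the two-step check described above can be sketched
in user space as follows. This is not the kernel code: the sim_* names and
the fake MSR array are hypothetical stand-ins for rdmsrl()/wrmsrl() and
struct hw_perf_event, and the values given for P4_CCCR_OVF and
ARCH_P4_UNFLAGGED_BIT are assumptions based on my reading of the P4 perf
headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, believed to mirror the kernel's P4 definitions. */
#define P4_CCCR_OVF		0x80000000ULL
#define ARCH_P4_CNTRVAL_BITS	40
#define ARCH_P4_UNFLAGGED_BIT	(1ULL << (ARCH_P4_CNTRVAL_BITS - 1))

/* Hypothetical fake MSR space: one CCCR (config) and one counter. */
enum { SIM_MSR_CCCR, SIM_MSR_COUNTER, SIM_MSR_COUNT };
static uint64_t sim_msr[SIM_MSR_COUNT];

/* Hypothetical stand-in for the relevant hw_perf_event fields. */
struct sim_hw_event {
	unsigned int config_base;	/* index of the CCCR */
	unsigned int event_base;	/* index of the counter */
};

static uint64_t sim_rdmsrl(unsigned int msr)
{
	assert(msr < SIM_MSR_COUNT);
	return sim_msr[msr];
}

static void sim_wrmsrl(unsigned int msr, uint64_t val)
{
	assert(msr < SIM_MSR_COUNT);
	sim_msr[msr] = val;
}

/* Returns 1 if the event overflowed, 0 otherwise. */
static int sim_clear_cccr_ovf(const struct sim_hw_event *hwc)
{
	uint64_t v;

	/* Step 1: the official overflow flag in the configuration register. */
	v = sim_rdmsrl(hwc->config_base);
	if (v & P4_CCCR_OVF) {
		sim_wrmsrl(hwc->config_base, v & ~P4_CCCR_OVF);
		return 1;
	}

	/*
	 * Step 2: the backup check. The counter holds a negative value
	 * while counting, so a cleared high bit means it has wrapped.
	 * This re-read of v from the counter is the step the patch adds;
	 * without it, v still holds the configuration register value.
	 */
	v = sim_rdmsrl(hwc->event_base);
	if (!(v & ARCH_P4_UNFLAGGED_BIT))
		return 1;

	return 0;
}
```

With the second sim_rdmsrl() removed, step 2 would test the configuration
value left over from step 1, which is the mistake the patch corrects.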
Signed-off-by: Don Zickus <dzickus@...hat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@...nvz.org>
CC: Lin Ming <ming.m.lin@...el.com>
---
arch/x86/kernel/cpu/perf_event_p4.c | 1 +
1 file changed, 1 insertion(+)
Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -777,6 +777,7 @@ static inline int p4_pmu_clear_cccr_ovf(
 	 * the counter has reached zero value and continued counting before
 	 * real NMI signal was received:
 	 */
+	rdmsrl(hwc->event_base, v);
 	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;
--
Cyrill