Message-ID: <20110216115701.3956.qmail@science.horizon.com>
Date: 16 Feb 2011 06:57:01 -0500
From: "George Spelvin" <linux@...izon.com>
To: airlied@...il.com, gorcunov@...il.com
Cc: a.p.zijlstra@...llo.nl, dzickus@...hat.com, eranian@...gle.com,
linux-kernel@...r.kernel.org, linux@...izon.com,
ming.m.lin@...el.com, mingo@...e.hu
Subject: Re: 2.6.38-rc2: Uhhuh. NMI received for unknown reason 2d on CPU 0.
> Ping on this problem, still seeing
>
> Uhhuh. NMI received for unknown reason 3c on CPU 0.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
>
> on my Pentium-D system here with latest Linus head.
>
> it's sometimes 3c, sometimes 3d, I'm going to bisect and push for
> reverts if nobody still has any clue about how to fix this.
The second patch (not the one you quote) fixed it for me. Almost 8 days
of uptime and no log spam.
It's appended below for your convenience. Are you using this
unsuccessfully?
From: Cyrill Gorcunov <gorcunov@...nvz.org>
Subject: [PATCH] perf, x86: P4 PMU -- Fix unflagged overflows test
A couple of people have reported an unknown NMI issue on p4 pmu.
This patch should fix it.
Reported-by: George Spelvin <linux@...izon.com>
Reported-by: Meelis Roos <mroos@...ux.ee>
Reported-by: Don Zickus <dzickus@...hat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@...nvz.org>
CC: Ingo Molnar <mingo@...e.hu>
CC: Lin Ming <ming.m.lin@...el.com>
CC: Don Zickus <dzickus@...hat.com>
CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
arch/x86/include/asm/perf_event_p4.h | 1 +
arch/x86/kernel/cpu/perf_event_p4.c | 11 ++++++++---
2 files changed, 9 insertions(+), 3 deletions(-)
Index: linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux-2.6.tip.orig/arch/x86/include/asm/perf_event_p4.h
+++ linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
@@ -22,6 +22,7 @@
#define ARCH_P4_CNTRVAL_BITS (40)
#define ARCH_P4_CNTRVAL_MASK ((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
+#define ARCH_P4_UNFLAGGED_BIT ((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))
#define P4_ESCR_EVENT_MASK 0x7e000000U
#define P4_ESCR_EVENT_SHIFT 25
Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -770,9 +770,14 @@ static inline int p4_pmu_clear_cccr_ovf(
return 1;
}
- /* it might be unflagged overflow */
- rdmsrl(hwc->event_base + hwc->idx, v);
- if (!(v & ARCH_P4_CNTRVAL_MASK))
+ /*
+ * In some circumstances the overflow might issue an NMI without
+ * setting the P4_CCCR_OVF bit. Since the counter holds a negative
+ * value, we simply check whether the high bit is set: if it is
+ * cleared, the counter has reached zero and continued counting
+ * before the real NMI signal was received.
+ */
+ if (!(v & ARCH_P4_UNFLAGGED_BIT))
return 1;
return 0;
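
For illustration only (not part of the patch): a minimal user-space sketch of
why the old and new tests differ. The three macros are copied from
perf_event_p4.h as shown in the patch. The helper names old_test(), new_test()
and counter_after() are hypothetical, and the model assumes the counter is
armed with the negated sampling period and counts upward within 40 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the patch (arch/x86/include/asm/perf_event_p4.h). */
#define ARCH_P4_CNTRVAL_BITS  (40)
#define ARCH_P4_CNTRVAL_MASK  ((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
#define ARCH_P4_UNFLAGGED_BIT ((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))

/* Old test: reports an overflow only if the counter reads exactly zero. */
static int old_test(uint64_t v)
{
	return !(v & ARCH_P4_CNTRVAL_MASK);
}

/* New test: reports an overflow whenever the high (sign) bit is clear. */
static int new_test(uint64_t v)
{
	return !(v & ARCH_P4_UNFLAGGED_BIT);
}

/*
 * Hypothetical model, not kernel code: a 40-bit counter armed with
 * -period, read back after `events` increments.
 */
static uint64_t counter_after(uint64_t period, uint64_t events)
{
	return ((uint64_t)0 - period + events) & ARCH_P4_CNTRVAL_MASK;
}
```

With a period of 1000, both tests report an overflow when the counter is read
at exactly 1000 events. If it is read three events later, the value is 3, so
only the new test still reports the overflow. Before the period elapses the
high bit is still set and neither test fires.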