lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110216115701.3956.qmail@science.horizon.com>
Date:	16 Feb 2011 06:57:01 -0500
From:	"George Spelvin" <linux@...izon.com>
To:	airlied@...il.com, gorcunov@...il.com
Cc:	a.p.zijlstra@...llo.nl, dzickus@...hat.com, eranian@...gle.com,
	linux-kernel@...r.kernel.org, linux@...izon.com,
	ming.m.lin@...el.com, mingo@...e.hu
Subject: Re: 2.6.38-rc2: Uhhuh. NMI received for unknown reason 2d on CPU 0.

> Ping on this problem, still seeing
> 
> Uhhuh. NMI received for unknown reason 3c on CPU 0.
> Do you have a strange power saving mode enabled?
> Dazed and confused, but trying to continue
> 
> on my Pentium-D system here with latest Linus head.
> 
> its sometimes 3c, sometimes 3d, I'm going to bisect and push for
> reverts if nobody still has any clue about how to fix this.

The second patch (not the one you quote) fixed it for me.  Almost 8 days
of uptime and no log spam.

It's appended below for your convenience.  Are you using this
unsuccessfully?


From: Cyrill Gorcunov <gorcunov@...nvz.org>
Subject: [PATCH] perf, x86: P4 PMU -- Fix unflagged overflows test

A couple of people have reported an unknown NMI issue on p4 pmu.
This patch should fix it.

Reported-by: George Spelvin <linux@...izon.com>
Reported-by: Meelis Roos <mroos@...ux.ee>
Reported-by: Don Zickus <dzickus@...hat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@...nvz.org>
CC: Ingo Molnar <mingo@...e.hu>
CC: Lin Ming <ming.m.lin@...el.com>
CC: Don Zickus <dzickus@...hat.com>
CC: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
 arch/x86/include/asm/perf_event_p4.h |    1 +
 arch/x86/kernel/cpu/perf_event_p4.c  |   11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

Index: linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
===================================================================
--- linux-2.6.tip.orig/arch/x86/include/asm/perf_event_p4.h
+++ linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
@@ -22,6 +22,7 @@
 
 #define ARCH_P4_CNTRVAL_BITS	(40)
 #define ARCH_P4_CNTRVAL_MASK	((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
+#define ARCH_P4_UNFLAGGED_BIT	((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))
 
 #define P4_ESCR_EVENT_MASK	0x7e000000U
 #define P4_ESCR_EVENT_SHIFT	25
Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
===================================================================
--- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
@@ -770,9 +770,14 @@ static inline int p4_pmu_clear_cccr_ovf(
 		return 1;
 	}
 
-	/* it might be unflagged overflow */
-	rdmsrl(hwc->event_base + hwc->idx, v);
-	if (!(v & ARCH_P4_CNTRVAL_MASK))
+	/*
+	 * at some circumstances the overflow might issue NMI but did
+	 * not set P4_CCCR_OVF bit so since a counter holds a negative value
+	 * we simply check for high bit being set, if it's cleared it means
+	 * the counter has reached zero value and continued counting before
+	 * real NMI signal was received
+	 */
+	if (!(v & ARCH_P4_UNFLAGGED_BIT))
 		return 1;
 
 	return 0;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ