Date:	Sat, 9 Jun 2007 04:27:10 +0200
From:	Björn Steinbrink <B.Steinbrink@....de>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	"Udo A. Steinberg" <us15@...inf.tu-dresden.de>,
	Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>, ak@...e.de,
	dzickus@...hat.com
Subject: [PATCH] i386: Fix the K7 NMI watchdog checkbit

On 2007.06.08 22:43:25 +0200, Ingo Molnar wrote:
> 
> * Björn Steinbrink <B.Steinbrink@....de> wrote:
> 
> > Anyway, both are bugs and should be fixed. Maybe we're even lucky and 
> > it fixes your hang. *fingers crossed*
> 
> just to make it clear: the NMI watchdog was working perfectly fine on 
> that box (in v2.6.21 and in dozens of kernel releases before that, for 
> multiple years) before Andi's cleanup patch. So lets find that bug first 
> or revert the cleanups.

Might have been pure luck. ;-) The culprit seems to be commit
b7471c6da94d30d3deadc55986cc38d1ff57f9ca (from Sep 2006), which
introduced the check bit to figure out whether an NMI was generated by
the watchdog timer. While the performance counter register on K7 is 64
bits wide, the upper 16 bits are reserved, so using bit 63 as the check
bit is wrong. A quick check using /dev/cpu/0/msr shows that here the
upper 16 bits are zero all the time. Chances are that this is not
deterministic and you got a 1 in bit 63 by random chance.
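
For anyone who wants to repeat that check, here is a rough userspace
sketch. It assumes the msr driver is loaded and that the program runs as
root. The helper names (read_msr, reserved_bits, checkbit_set) are made
up for illustration and are not kernel code; the MSR number is the
kernel's value for MSR_K7_PERFCTR0. It also assumes the watchdog arms the
counter with a negative value, so the top implemented bit stays set until
the counter overflows.

```c
#define _FILE_OFFSET_BITS 64    /* MSR numbers do not fit a 32-bit off_t */
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define MSR_K7_PERFCTR0 0xc0010004u
#define K7_CTR_BITS     48      /* implemented counter width on K7 */

/*
 * Read one MSR through the msr character device (e.g. /dev/cpu/0/msr).
 * The driver uses the file offset as the MSR number and returns 8 bytes.
 * Returns 0 on success, -1 on any failure.
 */
static int read_msr(const char *dev, uint32_t msr, uint64_t *val)
{
	ssize_t n;
	int fd = open(dev, O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, val, sizeof(*val), (off_t)msr);
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}

/* The reserved upper 16 bits of a raw counter value. */
static uint64_t reserved_bits(uint64_t ctr)
{
	return ctr >> K7_CTR_BITS;
}

/* 1 if the given bit is set in the raw counter value, else 0. */
static int checkbit_set(uint64_t ctr, unsigned int bit)
{
	return (int)((ctr >> bit) & 1);
}
```

Calling read_msr("/dev/cpu/0/msr", MSR_K7_PERFCTR0, &ctr) a few times and
passing the result to reserved_bits() shows whether the upper bits ever
read back non-zero on a given box. If they always read zero,
checkbit_set(ctr, 63) can never be 1, while checkbit_set(ctr, 47) tracks
whether the armed counter has overflowed yet.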

Björn



The performance counters on K7 are only 48 bits wide, so using bit 63 to
check if the counter overflowed is wrong. Let's use bit 47 instead.

Signed-off-by: Björn Steinbrink <B.Steinbrink@....de>
Cc: Don Zickus <dzickus@...hat.com>
Cc: Andi Kleen <andi@...stfloor.org>
---
diff --git a/arch/i386/kernel/cpu/perfctr-watchdog.c b/arch/i386/kernel/cpu/perfctr-watchdog.c
index 2b04c8f..82c6967 100644
--- a/arch/i386/kernel/cpu/perfctr-watchdog.c
+++ b/arch/i386/kernel/cpu/perfctr-watchdog.c
@@ -294,7 +294,7 @@ static struct wd_ops k7_wd_ops = {
 	.stop = single_msr_stop_watchdog,
 	.perfctr = MSR_K7_PERFCTR0,
 	.evntsel = MSR_K7_EVNTSEL0,
-	.checkbit = 1ULL<<63,
+	.checkbit = 1ULL<<47,
 };
 
 /* Intel Model 6 (PPro+,P2,P3,P-M,Core1) */