lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 1 Aug 2006 19:51:09 -0400
From:	Dave Jones <davej@...hat.com>
To:	Andreas Schwab <schwab@...e.de>
Cc:	Alexey Dobriyan <adobriyan@...il.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: single bit flip detector.

On Wed, Aug 02, 2006 at 01:28:49AM +0200, Andreas Schwab wrote:
 > Dave Jones <davej@...hat.com> writes:
 > 
 > > diff --git a/mm/slab.c b/mm/slab.c
 > > index 21ba060..39f1183 100644
 > > --- a/mm/slab.c
 > > +++ b/mm/slab.c
 > > @@ -1638,10 +1638,29 @@ static void poison_obj(struct kmem_cache
 > >  static void dump_line(char *data, int offset, int limit)
 > >  {
 > >  	int i;
 > > +	unsigned char total = 0, bad_count = 0, errors = 0;
 > 
 > No need to initialize errors here.

I'm going for the record of 'most times a patch gets submitted in one day'.
And to think we were complaining that patches don't get enough review ? :)
If every change had this much polish, we'd be awesome.

		Dave


In case where we detect a single bit has been flipped, we spew
the usual slab corruption message, which users instantly think
is a kernel bug.  In a lot of cases, single bit errors are
down to bad memory, or other hardware failure.

This patch adds an extra line to the slab debug messages
in those cases, in the hope that users will try memtest before
they report a bug.

000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
Single bit error detected. Possibly bad RAM. Run memtest86.

Signed-off-by: Dave Jones <davej@...hat.com>

diff --git a/mm/slab.c b/mm/slab.c
index 21ba060..39f1183 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -1638,10 +1638,28 @@ static void poison_obj(struct kmem_cache
 static void dump_line(char *data, int offset, int limit)
 {
 	int i;
+	unsigned char total = 0, bad_count = 0, errors;
 	printk(KERN_ERR "%03x:", offset);
-	for (i = 0; i < limit; i++)
+	for (i = 0; i < limit; i++) {
+		if (data[offset + i] != POISON_FREE) {
+			total += data[offset + i];
+			bad_count++;
+		}
 		printk(" %02x", (unsigned char)data[offset + i]);
+	}
 	printk("\n");
+
+	if (bad_count == 1) {
+		errors = total ^ POISON_FREE;
+		if (errors && !(errors & (errors-1))) {
+			printk (KERN_ERR "Single bit error detected. Probably bad RAM.\n");
+#ifdef CONFIG_X86
+			printk (KERN_ERR "Run memtest86+ or a similar memory test tool.\n");
+#else
+			printk (KERN_ERR "Run a memory test tool.\n");
+#endif
+		}
+	}
 }
 #endif
 

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ