Message-ID: <p73bqpd62b2.fsf@verdi.suse.de>
Date: 18 Sep 2006 09:50:41 +0200
From: Andi Kleen <ak@...e.de>
To: Robin Lee Powell <rlpowell@...italkingdom.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: Early boot hang on recent 2.6 kernels (> 2.6.3), on x86-64 with 16gb of RAM
Robin Lee Powell <rlpowell@...italkingdom.org> writes:
>
> This version is rather different, as it ends in:
>
> HARDWARE ERROR
> CPU 0: Machine Check Exception: 7 Bank 3: b40000000000083b
> RIP 10:<ffffffff80446e3e> {pci_conf1_read+0xbe/0xf0}
> TSC 2e7932dbf8 ADDR fdfc000cfc
> This is not a software problem!
> Run through mcelog --ascii to decode and contact your hardware vendor
> Kernel panic - not syncing: Uncorrected machine check
Decoded, it gives:
..
bus error 'local node origin, request didn't time out
data read mem transaction
i/o access, level generic'
..
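For readers without mcelog at hand, the fields quoted above can be recovered by hand from the low 16 bits of the status value (083b), which follow the architectural bus error layout 0000 1PPT RRRR IILL. The following is only a sketch of that bit-slicing: the helper names (mca_is_bus_error, mca_pp and so on) and the name strings are illustrative assumptions, not mcelog's code or its output wording.

```c
#include <stdint.h>
#include <stdio.h>

/* Flag bits in the upper part of an MCi_STATUS value. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was not corrected     */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting was enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* ADDR register is valid      */

/* Bus error codes have the form 0000 1PPT RRRR IILL in bits 15..0. */
static int mca_is_bus_error(uint64_t s) { return (s & 0xf800) == 0x0800; }
static unsigned mca_pp(uint64_t s)   { return (s >> 9) & 0x3; } /* participation */
static unsigned mca_t(uint64_t s)    { return (s >> 8) & 0x1; } /* timed out?    */
static unsigned mca_rrrr(uint64_t s) { return (s >> 4) & 0xf; } /* request type  */
static unsigned mca_ii(uint64_t s)   { return (s >> 2) & 0x3; } /* mem or I/O    */
static unsigned mca_ll(uint64_t s)   { return s & 0x3; }        /* cache level   */

/* Illustrative names only; mcelog words these differently. */
static const char *const pp_names[] = {
	"local node originated", "local node responded",
	"observed as third party", "generic"
};
static const char *const rrrr_names[] = {
	"generic", "generic read", "generic write", "data read", "data write",
	"instruction fetch", "prefetch", "evict", "snoop"
};
static const char *const ii_names[] = { "memory", "reserved", "I/O", "generic" };
static const char *const ll_names[] = { "L0", "L1", "L2", "generic" };

/* Print the decoded fields of a bus error status value. */
static void mca_print_bus_error(uint64_t s)
{
	unsigned r = mca_rrrr(s);

	if (!mca_is_bus_error(s)) {
		printf("not a bus error code\n");
		return;
	}
	printf("%s, %s, %s, %s access, level %s\n",
	       pp_names[mca_pp(s)],
	       mca_t(s) ? "request timed out" : "request did not time out",
	       r < 9 ? rrrr_names[r] : "reserved",
	       ii_names[mca_ii(s)], ll_names[mca_ll(s)]);
}
```

Calling mca_print_bus_error(0xb40000000000083bULL) from a small main() walks through the same fields as the decoded text above. The top byte b4 has the valid, uncorrected, enabled and address-valid bits set, which is consistent with the report printing an ADDR value.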
It will probably boot with mce=off acpi=off pci=conf1.
You have some buggy device that causes a bus timeout when its config space
is read. The old kernel most likely avoided touching it by luck.
Please apply the following patch and send the whole boot log.
The last line printed before the hang will tell us which device has this problem.
-Andi
diff -u linux-2.6.17-hack/arch/i386/pci/direct.c-o linux-2.6.17-hack/arch/i386/pci/direct.c
--- linux-2.6.17-hack/arch/i386/pci/direct.c-o 2006-04-20 02:17:33.000000000 +0200
+++ linux-2.6.17-hack/arch/i386/pci/direct.c 2006-09-18 09:48:46.000000000 +0200
@@ -19,6 +19,9 @@
 {
 	unsigned long flags;
 
+	printk("conf1 read bus %x devfn %x reg %x len %u\n",
+		bus, devfn, reg, len);
+
 	if ((bus > 255) || (devfn > 255) || (reg > 255)) {
 		*value = -1;
 		return -EINVAL;
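
The debug line added by the patch prints bus, devfn and reg in hex. To match the last line before the hang against lspci output, devfn has to be split into slot (upper five bits) and function (lower three bits). Below is a minimal sketch of that conversion. The struct and function names are hypothetical helpers, not kernel API; the address formula mirrors the usual configuration mechanism #1 encoding written to port 0xCF8.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* One parsed "conf1 read ..." debug line (hypothetical helper type). */
struct conf1_access {
	unsigned bus, devfn, reg, len;
};

/* devfn packs the device (slot) number in bits 7..3, function in bits 2..0. */
static unsigned conf1_slot(unsigned devfn) { return (devfn >> 3) & 0x1f; }
static unsigned conf1_func(unsigned devfn) { return devfn & 0x07; }

/*
 * Find and parse the debug line anywhere in a log line, so that a
 * log-level or timestamp prefix does not matter. Returns 1 on success.
 */
static int conf1_parse(const char *line, struct conf1_access *a)
{
	const char *p = strstr(line, "conf1 read bus ");

	if (!p)
		return 0;
	return sscanf(p, "conf1 read bus %x devfn %x reg %x len %u",
		      &a->bus, &a->devfn, &a->reg, &a->len) == 4;
}

/* Value written to the 0xCF8 address port for this access. */
static uint32_t conf1_address(const struct conf1_access *a)
{
	return 0x80000000u | ((uint32_t)a->bus << 16) |
	       ((uint32_t)a->devfn << 8) | (a->reg & ~3u);
}

/* Format as bus:slot.func, the notation lspci uses. */
static void conf1_format(const struct conf1_access *a, char *buf, size_t n)
{
	snprintf(buf, n, "%02x:%02x.%x",
		 a->bus, conf1_slot(a->devfn), conf1_func(a->devfn));
}
```

Feed the last "conf1 read" line from the log to conf1_parse() and print the result of conf1_format(); the device shown there is the one whose config space read triggers the machine check.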