[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4514539D.8010704@cora.nwra.com>
Date: Fri, 22 Sep 2006 15:20:29 -0600
From: Orion Poplawski <orion@...a.nwra.com>
To: Linus Torvalds <torvalds@...l.org>
CC: linux-kernel@...r.kernel.org
Subject: Re: How to debug random bus errors?
Linus Torvalds wrote:
>
> On Fri, 22 Sep 2006, Orion Poplawski wrote:
>> We're seeing programs die with "bus error" (SIGBUS) randomly on a dual
>> processor Opteron machine. I've run memtest86+ and cpuburn stress tests with
>> no failure. gdb on a core file seems uninteresting. Is there some way to
>> trace the kernel to try to get more insight?
>
> Which kernel?
>
Sorry, 2.6.17-1.2142_FC4smp, so fairly recent.
> Other than that, does it tend to happen to the same particular program? It
> might just be a user space bug that needs a specific set of things to
> happen. Some of those bugs go away when you disable address space
> randomization etc, since they can depend on just pure luck (or rather,
> lack there-of).
So far just one program. Forgot about address space randomization, may
need to look into turning that off to help debugging.
Thanks!
--
Orion Poplawski
System Administrator 303-415-9701 x222
NWRA/CoRA Division FAX: 303-415-9702
3380 Mitchell Lane orion@...a.nwra.com
Boulder, CO 80301 http://www.cora.nwra.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists