Message-ID: <20070130032734.GA28701@Krystal>
Date: Mon, 29 Jan 2007 22:27:35 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: "Martin J. Bligh" <mbligh@...igh.org>
Cc: linux-kernel@...r.kernel.org, Andrew Morton <akpm@...l.org>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: Bug report : reproducible memory bug (hardware failure, sorry)
* Martin J. Bligh (mbligh@...igh.org) wrote:
> Mathieu Desnoyers wrote:
> >Hi,
> >
> >Trying to build cross-compilers (or kernels) on a 2-way x86_64 (amd64) with
> >make -j3 triggers the following OOPS after about 30 minutes on
> >2.6.19.2. Due to the amount of time and the heavy load it takes before it
> >happens, I suspect a race condition. Memtest86 tests passed ok. The
> >amount of swap used when the condition happens is about 52k and stable
> >(only ~800MB/1GB are used).
> >
> >I am going to give it a look, but I suspect you might help narrowing it
> >down more quickly. Any insight would be appreciated.
>
> Mmm. that's going to be messy to debug ... but didn't we already know
> that kernel was racy? Or is 2.6.19.2 after that fix already? Does 20-rc6
> still break?
Hi Martin,
I finally re-ran memtest86 on the machine since it began to have too
many different kinds of errors (GPF, invalid instruction...). It turned
out that one of the memory modules was bad. I guess my brand new
list_debug race condition debugger will be useful in the future, but not
now. :)
I'll remember to let memtest86 run a few hours more on my new machines
next time.
Mathieu
--
OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68