Message-ID: <CA+8MBbKXqAw_01jWNmzonPmoER==gJeK1BnyMq5eLqqFMx2NTQ@mail.gmail.com>
Date: Fri, 21 Mar 2014 13:37:12 -0700
From: Tony Luck <tony.luck@...il.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Matthias Graf <matthias.graf@...ovgu.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: PROBLEM: Fatal Machine Check >= 3.13.5-101.fc19.x86_64
On Fri, Mar 21, 2014 at 1:13 PM, Borislav Petkov <bp@...en8.de> wrote:
> Provided the decode is correct and I'm reading it right, this looks
> like the cores get to livelock for some reason without any forward
> progress. The MCEs signal that there hasn't been any instruction retired
> in a relatively long time, thus a stall.
Agreed. There are some bus-level errors (low 16 bits of STATUS 0x0800)
and some timeouts (low 16 bits 0x0400).
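For anyone following along, here is a minimal sketch of that decode. It assumes the compound error code layout from the Intel SDM, where the MCA error code sits in bits 15:0 of the bank's STATUS register, 0x0400 is the internal timer error, and bus/interconnect errors have bit 11 set with the higher bits clear. The function name and the sample values are hypothetical, not taken from the reporter's log.

```python
def classify_mca_error_code(status: int) -> str:
    """Roughly classify the MCA error code in a 64-bit MCi_STATUS value.

    Only the two patterns mentioned in this thread are recognized;
    everything else is reported as "other".
    """
    code = status & 0xFFFF  # MCA error code lives in the low 16 bits

    # Internal timer error: exactly 0000 0100 0000 0000.
    if code == 0x0400:
        return "internal timer error"

    # Bus/interconnect error: 0000 1PPT RRRR IILL. Bit 12 is ignored
    # here (assumption: it can carry a filtering flag on some CPUs).
    if code & 0xE800 == 0x0800:
        return "bus/interconnect error"

    return "other"


if __name__ == "__main__":
    # Hypothetical STATUS values: bit 63 (VAL) set plus the error code.
    for status in ((1 << 63) | 0x0800, (1 << 63) | 0x0400):
        print(f"{status:#018x}: {classify_mca_error_code(status)}")
```

This only looks at the low 16 bits; the upper STATUS bits (valid, uncorrected, overflow and so on) still need a proper decoder such as mcelog.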
> You say, this happens when gnome starts. Does it also happen if you
> don't start gnome, i.e. don't start X at all? Try booting into a
> runlevel without graphics.
>
> Tony, any other ideas?
My best guess is a graphics(?) driver making wild accesses to some I/O
registers that never respond. If booting without graphics works, then
that adds some weight to the theory.
Other useful tests would be to check upstream kernels 3.12 and 3.13 to
see if something is odd in the Fedora additions, and 3.14-rc7 to see
if it is already fixed upstream.
If upstream 3.12 works and 3.13 breaks (and it is not fixed in 3.14-rc7),
then bisecting between 3.12 and 3.13 would be helpful.
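The test plan above can be written down as a small decision function. This is only a sketch of the reasoning, assuming each upstream kernel has been booted and judged as "works" or "breaks". The function name, its parameters, and the final "inconclusive" fallback are illustrative additions, not part of the plan itself.

```python
def next_step(works_3_12: bool, works_3_13: bool, works_3_14_rc7: bool) -> str:
    """Suggest a follow-up given boot results for three upstream kernels.

    True means the kernel ran without the fatal machine check.
    """
    # Upstream 3.13 is fine while the Fedora 3.13.5 build is not:
    # look at what Fedora adds on top.
    if works_3_13:
        return "suspect the Fedora additions"

    # Upstream 3.13 breaks but the release candidate is fine.
    if works_3_14_rc7:
        return "likely already fixed upstream"

    # 3.12 good, 3.13 bad, 3.14-rc7 still bad: a regression window.
    if works_3_12:
        return "bisect between 3.12 and 3.13"

    # Assumption: anything else is not covered by the plan above.
    return "inconclusive, need more data"


if __name__ == "__main__":
    print(next_step(works_3_12=True, works_3_13=False, works_3_14_rc7=False))
```

The order of the checks matters: the upstream 3.13 result is tested first because it is the one that separates Fedora-specific changes from upstream ones.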
-Tony