Message-ID: <loom.20110217T003826-785@post.gmane.org>
Date: Thu, 17 Feb 2011 00:17:27 +0000 (UTC)
From: Ryan Underwood <ryan.underwood@...ghtsafety.com>
To: linux-kernel@...r.kernel.org
Subject: Re: 2.6.38-rc2: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Preeti Khurana <Preeti.Khurana <at> guavus.com> writes:
>
> I am getting the similar issue as reported
> in https://lkml.org/lkml/2011/2/10/187
>
> Can someone tell me if the same issue because I am getting the
> problem on Intel Xeon..
>
I am seeing exactly the same problem (on 2.6.35, as Preeti originally reported)
on some Xeon servers, but only with recently shipped BIOS revisions. The OS is
CentOS 5.5.
In my case, the system sometimes hangs silently, sometimes prints an NMI
message immediately before hanging, and sometimes prints a long backtrace
originating at cpu_idle(). The NMI reason code varies, but in my
observation it is usually 21 or 31.
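To make the reason codes easier to compare, here is a minimal sketch that decodes the byte bit by bit. It assumes the reason code is the hex value read from I/O port 0x61 and that the bits follow the conventional PC/AT layout; the bit names, `decode_nmi_reason` and `is_unknown_reason` are my own illustrative labels, not kernel identifiers, and a given chipset may assign some bits differently.

```python
# Bit names for the NMI status/control byte read from I/O port 0x61.
# Assumption: conventional PC/AT layout; a particular chipset may differ.
PORT_61_BITS = {
    0x80: "SERR#/memory parity error",
    0x40: "IOCHK (I/O channel check)",
    0x20: "timer 2 output",
    0x10: "refresh toggle",
    0x08: "IOCHK reporting disabled",
    0x04: "SERR#/parity reporting disabled",
    0x02: "speaker data enable",
    0x01: "timer 2 gate",
}

# Bits 7 and 6 are the two NMI sources reported through this port.
NMI_SOURCE_MASK = 0xC0


def decode_nmi_reason(reason):
    """Return the names of the bits set in a reason byte, highest bit first."""
    return [name for bit, name in PORT_61_BITS.items() if reason & bit]


def is_unknown_reason(reason):
    """True when neither the SERR# bit nor the IOCHK bit is set."""
    return (reason & NMI_SOURCE_MASK) == 0


# The codes mentioned in this thread: 21 and 31 (here), 2d (subject line).
for code in (0x21, 0x31, 0x2D):
    names = ", ".join(decode_nmi_reason(code))
    print(f"{code:#04x}: unknown={is_unknown_reason(code)}: {names}")
```

Under that assumed layout, 21 and 31 differ only in bit 4, the refresh toggle, which flips continuously, so the two values may well describe the same condition. None of 21, 31 or 2d has bit 7 or bit 6 set, which would be consistent with the "unknown reason" wording.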
The problem seems to be triggered by accessing a PCI card (via MMIO): until
the card is accessed, the system runs indefinitely with no problems.
Other servers of exactly the same model (Intel SR2500) with an older BIOS
revision work fine (the working BIOS is dated 3/14/2008, the non-working one
3/9/2010). All software is identical in these cases.
Also, in one instance, kernel v2.6.18 is used on these servers with the
3/14/2008 BIOS revision without a problem. The rest of the software is again
the same (except for kernel and drivers).
It seems to be a problem with newer kernels combined with the newer Intel BIOS.
I have not tried an older kernel on the newer BIOS yet.
I have not yet tried the following patches, both of which seem to address
spurious NMI messages that are not accompanied by system lockups:
https://lkml.org/lkml/2011/2/16/106
https://lkml.org/lkml/2011/2/1/286
Neither nmi_watchdog=0 nor pcie_aspm=off solves the problem.
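When testing boot options like these, it is worth confirming that they actually reached the running kernel. A minimal sketch, assuming the options show up in /proc/cmdline; `parse_cmdline` and `read_cmdline` are hypothetical helper names, and the sample line is made up for illustration rather than taken from the affected servers.

```python
def parse_cmdline(text):
    """Split a kernel command line into {name: value}; bare flags map to None.

    Simplification: values containing quoted spaces are not handled.
    """
    params = {}
    for token in text.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path="/proc/cmdline"):
    """Parse the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return parse_cmdline(f.read())


# Hypothetical sample line; on a live system call read_cmdline() instead.
sample = "ro root=/dev/sda1 nmi_watchdog=0 pcie_aspm=off"
params = parse_cmdline(sample)
print("nmi_watchdog =", params.get("nmi_watchdog"))
print("pcie_aspm    =", params.get("pcie_aspm"))
```

This only shows that the options were passed on the command line, not that the kernel build honors them.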
I am not subscribed, so please Cc me.