Message-ID: <3908561D78D1C84285E8C5FCA982C28F329504FD@ORSMSX114.amr.corp.intel.com>
Date: Fri, 21 Nov 2014 21:31:56 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: rui wang <ruiv.wang@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"gong.chen@...ux.intel.com" <gong.chen@...ux.intel.com>,
"Wang, Rui Y" <rui.y.wang@...el.com>
Subject: RE: [PATCH v3] x86/mce: Try printing all machine check banks known
before panic
>
> 	/*
> 	 * No machine check event found. Must be some external
> 	 * source or one CPU is hung. Panic.
> 	 */
> 	if (global_worst <= MCE_KEEP_SEVERITY && mca_cfg.tolerant < 3)
> 		mce_panic("Machine check from unknown source", NULL, NULL);
>
> Provided this comment is correct, it doesn't sound like any MCE record
> will ever tell us what causes the error as an external source or a hung
> CPU doesn't generate an MCE record in any bank, does it?
That means there were no VALID=1, EN=1, S=1 errors anywhere. But there
might be some other things logged that would help us understand.
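
As a rough illustration of that distinction (a sketch only, not the kernel's actual severity grading), the check on a bank's IA32_MCi_STATUS value could look like the following. The bit positions follow the Intel SDM and the macro names mirror the kernel's MCI_STATUS_* definitions; the two helper functions are hypothetical names made up for this example. Note that the S bit is only defined on processors that advertise software error recovery support, which this sketch does not check.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS bits (Intel SDM); names mirror the kernel's macros. */
#define MCI_STATUS_VAL	(1ULL << 63)	/* register holds a valid error */
#define MCI_STATUS_EN	(1ULL << 60)	/* error reporting was enabled */
#define MCI_STATUS_S	(1ULL << 56)	/* error was signaled via #MC */

/*
 * Hypothetical helper: true if this bank logged an error that could
 * account for the machine check (VALID=1, EN=1, S=1).
 */
static bool bank_explains_mce(uint64_t status)
{
	const uint64_t need = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_S;

	return (status & need) == need;
}

/*
 * Hypothetical helper: true if the bank holds something valid that does
 * not qualify as the cause, but may still be worth printing before panic.
 */
static bool bank_has_other_log(uint64_t status)
{
	return (status & MCI_STATUS_VAL) && !bank_explains_mce(status);
}
```

With these helpers, a bank reporting VALID=1 and EN=1 but S=0 would not explain the machine check, yet would still count as "other things logged".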
We are into CPU errata territory here, though ... we aren't supposed to get
machine checks that don't have a logged cause. We panic for spurious
machine checks because we know something has gone horribly wrong,
even if we don't know what that something was.
-Tony