Message-ID: <3908561D78D1C84285E8C5FCA982C28F39F85DBE@ORSMSX114.amr.corp.intel.com>
Date: Tue, 15 Dec 2015 23:46:03 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Borislav Petkov <bp@...en8.de>
CC: Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
"Williams, Dan J" <dan.j.williams@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-nvdimm@...1.01.org" <linux-nvdimm@...1.01.org>,
"x86@...nel.org" <x86@...nel.org>
Subject: RE: [PATCHV2 2/3] x86, ras: Extend machine check recovery code to
annotated ring0 areas
>> +    /* Fault was in recoverable area of the kernel */
>> +    if ((m.cs & 3) != 3 && worst == MCE_AR_SEVERITY)
>> +        if (!fixup_mcexception(regs, m.addr))
>> +            mce_panic("Failed kernel mode recovery", &m, NULL);
>                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Does that always imply a failed kernel mode recovery? I don't see
>
> (m.cs == 0 and MCE_AR_SEVERITY)
>
> MCEs always meaning that a recovery should be attempted there. I think
> this should simply say
>
> mce_panic("Fatal machine check on current CPU", &m, msg);

I don't think this can ever happen. If we were in kernel mode and decided
that the severity was MCE_AR_SEVERITY, then search_mcexception_table()
must have found an entry for the IP where the machine check happened.

The only way for fixup_mcexception() to fail is if search_mcexception_table()
now suddenly doesn't find the entry it found earlier.

But if this "can't happen" thing actually does happen ... I'd like the panic
message to be different from the other mce_panic() messages so you'll know
to blame me.
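
To make the argument above concrete, here is a minimal userspace sketch of the two lookups. It is not the patch code: the table contents, the model_* names, and the enum values are all hypothetical, and the real fixup_mcexception() takes (regs, m.addr) rather than a bare IP. The only assumption carried over from the discussion is that kernel-mode severity is graded as recoverable when the table lookup finds the faulting IP, and that the fixup step repeats that same lookup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the annotated machine check table. */
struct mc_entry {
	unsigned long insn;	/* address of the annotated instruction */
	unsigned long fixup;	/* address to resume at after recovery */
};

/* Illustrative table contents; the addresses are made up. */
static const struct mc_entry mc_table[] = {
	{ 0x1000, 0x1800 },
	{ 0x2000, 0x2800 },
};

enum severity { SEV_PANIC, SEV_AR };
enum outcome { OUT_RECOVERED, OUT_PANIC_FATAL, OUT_PANIC_FAILED_RECOVERY };

/* Linear lookup standing in for search_mcexception_table(). */
static const struct mc_entry *model_search(unsigned long ip)
{
	for (size_t i = 0; i < sizeof(mc_table) / sizeof(mc_table[0]); i++)
		if (mc_table[i].insn == ip)
			return &mc_table[i];
	return NULL;
}

/* Kernel-mode grading in this model: AR only if the faulting IP is annotated. */
static enum severity model_severity(unsigned long ip)
{
	return model_search(ip) ? SEV_AR : SEV_PANIC;
}

/* Stand-in for fixup_mcexception(): second lookup, then redirect the IP. */
static bool model_fixup(unsigned long *ip)
{
	const struct mc_entry *e = model_search(*ip);

	if (!e)
		return false;
	*ip = e->fixup;
	return true;
}

/* Models the kernel-mode branch of the handler for a fault at *ip. */
static enum outcome model_handle(unsigned long *ip)
{
	if (model_severity(*ip) != SEV_AR)
		return OUT_PANIC_FATAL;
	if (!model_fixup(ip))
		return OUT_PANIC_FAILED_RECOVERY;	/* the "can't happen" branch */
	return OUT_RECOVERED;
}
```

With an unchanging table, model_handle() can only return OUT_RECOVERED or OUT_PANIC_FATAL, because model_fixup() repeats the lookup that model_severity() already did. OUT_PANIC_FAILED_RECOVERY is reachable only if the two lookups disagree, which is why a distinct panic string is useful there.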
Applied all the other suggestions.
-Tony