Date: Tue, 3 Mar 2015 18:09:27 +0000
From: "Luck, Tony" <tony.luck@...el.com>
To: Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
Borislav Petkov <bp@...en8.de>
CC: Prarit Bhargava <prarit@...hat.com>,
Vivek Goyal <vgoyal@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Junichi Nomura <j-nomura@...jp.nec.com>,
Kiyoshi Ueda <k-ueda@...jp.nec.com>
Subject: RE: [PATCH v3 1/2] x86: mce: kexec: switch MCE handler for
kexec/kdump
> +static void machine_check_under_kdump(struct pt_regs *regs, long error_code)
> +{
> +	if (mca_cfg.kdump_cpu == smp_processor_id())
> +		pr_emerg("MCE triggered when kdumping. If you are lucky enough, you will have a kdump. Otherwise, this is a dying message.\n");
I'm worried about the SRAR case here. Your code just returns, which will trigger the same machine check again. The system will spin forever printing this message.
I think you have to look at MCG_STATUS and scan the machine check banks to make a choice. There are some simple cases:
MCG_STATUS.RIPV=0 -> cannot return (where will the cpu go - you have no idea!)
SRAO -> safe to just return
SRAR -> should not return
But the rest may require some thought. If there is a PCC=1 error, then you may end up with a corrupt dump. Perhaps this case will already be covered by RIPV=0?
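To make the simple cases concrete, here is a rough sketch of the decision, written as standalone C rather than kernel code. The names kdump_mce_classify(), mci_is_sr() and enum kdump_mce_action are made up for illustration and are not from the patch. The bit positions follow the IA32_MCG_STATUS and IA32_MCi_STATUS layouts in the Intel SDM, and the S/AR bits are only meaningful when the cpu advertises software error recovery (MCG_SER_P). Treating every valid bank that is neither SRAO nor SRAR as "undecided", and returning when no bank has anything logged, are assumptions of the sketch, not settled policy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions per the Intel SDM register layouts. */
#define MCG_STATUS_RIPV	(1ULL << 0)	/* IA32_MCG_STATUS: restart IP valid */
#define MCI_STATUS_VAL	(1ULL << 63)	/* IA32_MCi_STATUS: bank holds a valid error */
#define MCI_STATUS_UC	(1ULL << 61)	/* uncorrected error */
#define MCI_STATUS_PCC	(1ULL << 57)	/* processor context corrupt */
#define MCI_STATUS_S	(1ULL << 56)	/* signaled via machine check */
#define MCI_STATUS_AR	(1ULL << 55)	/* action required */

/* Hypothetical names, for illustration only. */
enum kdump_mce_action {
	KDUMP_MCE_RETURN,	/* safe to return and keep dumping */
	KDUMP_MCE_NO_RETURN,	/* returning would re-fault or go nowhere */
	KDUMP_MCE_UNDECIDED,	/* needs more thought (e.g. PCC=1) */
};

/* SRAO and SRAR both have UC=1, PCC=0, S=1; AR tells them apart. */
static bool mci_is_sr(uint64_t status, bool action_required)
{
	uint64_t mask = MCI_STATUS_UC | MCI_STATUS_PCC |
			MCI_STATUS_S | MCI_STATUS_AR;
	uint64_t want = MCI_STATUS_UC | MCI_STATUS_S |
			(action_required ? MCI_STATUS_AR : 0);

	return (status & mask) == want;
}

static enum kdump_mce_action
kdump_mce_classify(uint64_t mcg_status, const uint64_t *bank_status,
		   size_t nbanks)
{
	/* Assumption: with nothing logged, nothing argues against returning. */
	enum kdump_mce_action action = KDUMP_MCE_RETURN;
	size_t i;

	/* RIPV=0: there is no valid place to return to. */
	if (!(mcg_status & MCG_STATUS_RIPV))
		return KDUMP_MCE_NO_RETURN;

	for (i = 0; i < nbanks; i++) {
		uint64_t status = bank_status[i];

		if (!(status & MCI_STATUS_VAL))
			continue;	/* nothing logged in this bank */
		if (mci_is_sr(status, true))
			return KDUMP_MCE_NO_RETURN;	/* SRAR: would fault again */
		if (mci_is_sr(status, false))
			continue;	/* SRAO: safe to just return */
		action = KDUMP_MCE_UNDECIDED;	/* the rest, including PCC=1 */
	}
	return action;
}
```

In a real handler the inputs would come from reading MSR_IA32_MCG_STATUS and each bank's status MSR; the sketch takes them as plain arguments so the logic can be checked in isolation.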
-Tony