Message-ID: <20100910160211.GH4879@redhat.com>
Date: Fri, 10 Sep 2010 12:02:11 -0400
From: Don Zickus <dzickus@...hat.com>
To: Huang Ying <ying.huang@...el.com>
Cc: Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
linux-kernel@...r.kernel.org, Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFC 5/6] x86, NMI, Add support to notify hardware error with
unknown NMI
> @@ -349,6 +351,14 @@ io_check_error(unsigned char reason, str
> static notrace __kprobes void
> unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
> {
> + /*
> + * On some platforms, hardware errors may be notified via
> + * unknown NMI
> + */
> + if (unknown_nmi_for_hwerr)
> + panic("NMI for hardware error without error record: "
> + "Not continuing");
> +
> #ifdef CONFIG_MCA
I'm not sure I agree with this. I still see PCI SERRs that do not come in
through port 0x61 and get routed to unknown_nmi_error. I'm not sure we
should just assume that it is an APEI/HEST error and panic the box.
Also, all the perf problems we have seen recently have been going through
that path as we slowly try to figure out why we are not catching those
unknown NMIs.
I am grasping at straws here, but is there a register that APEI/HEST can
poke to see if it generated the NMI?
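To make the question concrete, here is a rough userspace sketch (not kernel
code) of the check being asked about. It assumes a status word exists that
firmware sets when APEI/HEST raises the NMI; nothing in this thread confirms
that such a register is available. HWERR_NMI_PENDING, read_hwerr_status(),
fake_hwerr_status and classify_unknown_nmi() are all hypothetical names made
up for illustration. Only unknown_nmi_for_hwerr comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical: bit 0 of a firmware-owned status word means
 * "APEI/HEST raised this NMI". This is an assumption; no such
 * register is confirmed to exist.
 */
#define HWERR_NMI_PENDING 0x1u

enum nmi_action {
	NMI_ACT_PANIC,          /* hardware error confirmed, do not continue */
	NMI_ACT_REPORT_UNKNOWN  /* fall through to the usual unknown-NMI path */
};

/* Stand-in for the status register so the sketch runs in userspace. */
static uint32_t fake_hwerr_status;

/* Real code would do an MMIO or port read here instead. */
static uint32_t read_hwerr_status(void)
{
	return fake_hwerr_status;
}

/*
 * Panic only when the platform flag is set AND the (assumed) status
 * bit says the NMI really came from the hardware error source.
 * PCI SERRs and stray perf NMIs keep taking the existing path.
 */
static enum nmi_action classify_unknown_nmi(bool unknown_nmi_for_hwerr)
{
	if (unknown_nmi_for_hwerr &&
	    (read_hwerr_status() & HWERR_NMI_PENDING))
		return NMI_ACT_PANIC;

	return NMI_ACT_REPORT_UNKNOWN;
}
```

With a check like this, an unknown NMI on a platform with
unknown_nmi_for_hwerr set would only panic when the status bit is set, and
everything else would still be reported as an unknown NMI.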
Cheers,
Don