Message-ID: <20080904205345.GO18288@one.firstfloor.org>
Date: Thu, 4 Sep 2008 22:53:45 +0200
From: Andi Kleen <andi@...stfloor.org>
To: "Mingarelli, Thomas" <Thomas.Mingarelli@...com>
Cc: Andi Kleen <andi@...stfloor.org>, Vivek Goyal <vgoyal@...hat.com>,
Don Zickus <dzickus@...hat.com>, Ingo Molnar <mingo@...e.hu>,
Prarit Bhargava <prarit@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"arozansk@...hat.com" <arozansk@...hat.com>,
"ak@...ux.intel.com" <ak@...ux.intel.com>,
Alan Cox <alan@...rguk.ukuu.org.uk>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Maciej W. Rozycki" <macro@...ux-mips.org>
Subject: Re: [PATCH RFC] NMI Re-introduce un[set]_nmi_callback
On Thu, Sep 04, 2008 at 08:21:40PM +0000, Mingarelli, Thomas wrote:
> The BIOS does the actual logging of the cause of the NMI. What kind of NMI:
>
> PCI Bus Parity error
> Double bit memory error
> .
> .
> .
> And so on.
>
> The watchdog is a separate part of the driver. It can be enabled or not; most of our customers will want the NMI sourcing capability of the driver.
> With Prarit's patch we no longer need to worry about the watchdog timer firing. However, yes that was troublesome before his patch. We could not distinguish between a REAL NMI and a watchdog timer tick.
>
> The BIOS does not come into play until the hpwdt nmi handler gets called.
If you use the die chain (as you should) you'll get notified
of the NMIs, but your handler has to decide if the NMI is for
you or not. The way the chain works is that it asks everyone
(in priority order) and the first one who says "it's for me"
will get it.
So if your handler can decide "This is an NMI that came from
a source I know about" it can be a proper[1] NMI citizen.
Otherwise it will be hard to make it coexist nicely.
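To make the "ask everyone in priority order, first claimant wins" behaviour concrete, here is a minimal userspace model of such a chain. It is only a sketch: the model_* names, the SRC_* source codes and the two example handlers are made up for illustration and are not the kernel's die chain API, which passes more context to each handler.

```c
#include <stddef.h>

/* Userspace model only. These constants mimic the idea of "not mine"
 * versus "mine, stop here"; they are defined locally for illustration. */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_STOP = 1 };

/* Hypothetical NMI sources, used only to drive the example. */
enum { SRC_WATCHDOG = 1, SRC_PCI_PARITY = 2, SRC_UNKNOWN = 3 };

struct model_handler {
	const char *name;
	int priority;                 /* higher priority is asked first */
	int (*fn)(int nmi_source);    /* returns MODEL_NOTIFY_DONE or _STOP */
	struct model_handler *next;
};

/* Insert a handler, keeping the list sorted by descending priority.
 * A new handler goes after existing handlers of equal priority. */
static void model_register(struct model_handler **head,
			   struct model_handler *h)
{
	while (*head && (*head)->priority >= h->priority)
		head = &(*head)->next;
	h->next = *head;
	*head = h;
}

/* Ask each handler in turn. The first one that claims the NMI ends
 * the walk. Returns the claimant, or NULL if nobody claimed it. */
static struct model_handler *model_dispatch(struct model_handler *head,
					    int nmi_source)
{
	for (; head; head = head->next)
		if (head->fn(nmi_source) == MODEL_NOTIFY_STOP)
			return head;
	return NULL;
}

/* Each handler claims only the source it knows about. */
static int watchdog_handler(int src)
{
	return src == SRC_WATCHDOG ? MODEL_NOTIFY_STOP : MODEL_NOTIFY_DONE;
}

static int chipset_handler(int src)
{
	return src == SRC_PCI_PARITY ? MODEL_NOTIFY_STOP : MODEL_NOTIFY_DONE;
}
```

A handler that cannot tell whether the NMI is its own has only two bad options in this model: always claim (and starve everyone below it) or never claim (and be useless). That is the coexistence problem described above.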
Then if the BIOS tells you the real cause you should also take
over the final handler anyways because as it was pointed out earlier
the Linux default fallback handler is crap (it made sense on an
IBM PC-AT but not today). If you can ask the BIOS for the real
reason you could printk that instead and everyone will be more
happy. That would be one of those "NMI chipset drivers" I talked
about earlier.
That probably should be a separate driver because it's orthogonal to
the watchdog.
But it should only take
the default handler when kdump is not active.
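As a rough illustration of that last rule, a final handler could decline whenever kdump is active and otherwise report the cause the BIOS gave it. This is a userspace sketch only: the model_* names, the cause codes, the kdump_active flag and the message format are all assumptions made for the example, and a real driver would query the BIOS and use printk rather than a buffer.

```c
#include <stdio.h>

/* Userspace model only; constants defined locally for illustration. */
enum { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_STOP = 1 };

/* Hypothetical cause codes. Real values would come from the BIOS. */
enum model_cause {
	CAUSE_UNKNOWN = 0,
	CAUSE_PCI_PARITY,
	CAUSE_DOUBLE_BIT_MEM,
};

static const char *model_cause_name(enum model_cause c)
{
	switch (c) {
	case CAUSE_PCI_PARITY:
		return "PCI bus parity error";
	case CAUSE_DOUBLE_BIT_MEM:
		return "double bit memory error";
	default:
		return "unknown";
	}
}

/* Final (lowest priority) handler: leave the NMI alone when kdump is
 * active, otherwise claim it and write the reported cause into buf. */
static int model_final_handler(int kdump_active, enum model_cause cause,
			       char *buf, size_t len)
{
	if (kdump_active)
		return MODEL_NOTIFY_DONE;

	snprintf(buf, len, "NMI: %s", model_cause_name(cause));
	return MODEL_NOTIFY_STOP;
}
```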
-Andi
[1] proper defined as in "it's still racy, but on low volume
NMIs it will hopefully DTRT"