linux-kernel - Re: [PATCH -v2 7/7] x86, NMI, Remove do_nmi


Message-ID: <AANLkTinuNKfTBYTK1+Ks-QSPbQ3qz5cCZC6vUwUZ+zCh@mail.gmail.com>
Date:	Wed, 29 Sep 2010 14:55:58 +0800
From:	huang ying <huang.ying.caritas@...il.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	Huang Ying <ying.huang@...el.com>,
	Robert Richter <robert.richter@....com>,
	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andi Kleen <andi@...stfloor.org>
Subject: Re: [PATCH -v2 7/7] x86, NMI, Remove do_nmi_callback logic

Hi, Don,

On Tue, Sep 28, 2010 at 11:19 PM, Don Zickus <dzickus@...hat.com> wrote:
>> If NMI comes from watchdog, nmi_watchdog_tick() will return 1. So
>> do_nmi_callback() is NOT for watchdog NMI, but for unknown NMI. Why do
>> we call DIE_NMIWATCHDOG for unknown NMI (NOT watchdog NMI)? die_nmi is
>> for watchdog, not unknown NMI.
>
> I think watchdog is an overloaded term.  I was under the impression that
> once the nmi watchdog determined a problem, it called the DIE_NMIWATCHDOG
> die chain to see if any other drivers wanted to clean up or do their thing
> first before panic'ing (namely drivers in drivers/char/watchdog).

Yes, I think so too. And in the original code, almost all DIE_NMIxxx
values are used in this way:

DIE_NMI is called after reading port 0x61, to see if any other driver
wants to recover from the error, based on the reason read from port
0x61.

DIE_NMIWATCHDOG is used to see if any other driver wants to clean up
or do its thing before panic.

DIE_NMIUNKNOWN is used to see if any other driver wants to clean up
or debug before the default unknown-NMI logic (such as panic).

DIE_NMI_IPI is used to see if any driver wants to process the NMI (sent
via APIC? Maybe it is named after that).

So the original implementation of default_do_nmi() is:

- Determine the reason/source of the NMI in default_do_nmi(), although
the exact reason/source is not always determined (for perf, for example).

- Call notify_die() for the corresponding NMI reason/source, to see if
any driver wants to process it instead of the default operation.

- If no other driver processed it, perform the default operation, such
as panic for DIE_NMIUNKNOWN.
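
As a rough illustration of the three steps above, here is a user-space
sketch in C. It is not the kernel code: register_die_handler(),
notify_chain(), sketch_do_nmi() and claim_nmi() are hypothetical names,
the enum values are stand-ins, and only the DIE_NMI* names come from the
discussion. Step 1 (determining the reason) is assumed to have happened
already and is passed in as an argument.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel definitions differ. */
enum die_val { DIE_NMI, DIE_NMIWATCHDOG, DIE_NMIUNKNOWN, DIE_NMI_IPI,
               DIE_VAL_COUNT };
enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 };
enum nmi_result { NMI_HANDLED_BY_DRIVER, NMI_DEFAULT_OP };

typedef int (*die_handler_t)(enum die_val val);

/* At most one illustrative handler per die_val, to keep the sketch short. */
static die_handler_t handlers[DIE_VAL_COUNT];

static void register_die_handler(enum die_val val, die_handler_t h)
{
    handlers[val] = h;
}

/* Step 2: ask a driver whether it wants to process this NMI. */
static int notify_chain(enum die_val val)
{
    if (handlers[val] == NULL)
        return NOTIFY_DONE;
    return handlers[val](val);
}

/*
 * Steps 2 and 3 for an already determined reason: notify first, and
 * fall back to the default operation only if no driver took the NMI.
 */
static enum nmi_result sketch_do_nmi(enum die_val reason)
{
    if (notify_chain(reason) == NOTIFY_STOP)
        return NMI_HANDLED_BY_DRIVER;
    return NMI_DEFAULT_OP; /* e.g. panic for DIE_NMIUNKNOWN */
}

/* Example driver callback that claims every NMI it is asked about. */
static int claim_nmi(enum die_val val)
{
    (void)val;
    return NOTIFY_STOP;
}
```

With no handler registered, sketch_do_nmi(DIE_NMIUNKNOWN) falls through
to the default operation; after register_die_handler(DIE_NMIUNKNOWN,
claim_nmi) the driver takes it instead.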

The original implementation needs to be changed, because it uses only
port 0x61 to determine the reason/source of an NMI. We need an order-based
scheme to determine the reason/source of an NMI. The order is as follows:

CPU-specific (CPU local) NMI
non-CPU-specific (global) NMI
port 0x61
NMI Watchdog
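
The order above could be sketched as a table of checks tried one after
another, with the first match deciding the source. This is a user-space
illustration only: struct nmi_state, the check_*() functions and
determine_nmi_source() are hypothetical, and treating bits 0xc0 of the
port 0x61 value as the "reason present" test is an assumption of this
sketch.

```c
#include <stddef.h>

enum nmi_source {
    NMI_SRC_CPU_LOCAL,
    NMI_SRC_GLOBAL,
    NMI_SRC_PORT_0X61,
    NMI_SRC_WATCHDOG,
    NMI_SRC_UNKNOWN
};

/* Hypothetical snapshot of what each check would find. */
struct nmi_state {
    int cpu_local_pending;
    int global_pending;
    unsigned char port61_reason; /* value read from port 0x61 */
    int watchdog_fired;
};

typedef int (*nmi_check_t)(const struct nmi_state *st);

static int check_cpu_local(const struct nmi_state *st)
{
    return st->cpu_local_pending;
}

static int check_global(const struct nmi_state *st)
{
    return st->global_pending;
}

static int check_port61(const struct nmi_state *st)
{
    /* Assumption: the two top bits carry the error reasons. */
    return (st->port61_reason & 0xc0) != 0;
}

static int check_watchdog(const struct nmi_state *st)
{
    return st->watchdog_fired;
}

/* Table order is the determination order listed above. */
static const struct {
    nmi_check_t check;
    enum nmi_source source;
} checks[] = {
    { check_cpu_local, NMI_SRC_CPU_LOCAL },
    { check_global,    NMI_SRC_GLOBAL },
    { check_port61,    NMI_SRC_PORT_0X61 },
    { check_watchdog,  NMI_SRC_WATCHDOG },
};

/* Return the first source whose check matches, else "unknown". */
static enum nmi_source determine_nmi_source(const struct nmi_state *st)
{
    size_t i;

    for (i = 0; i < sizeof(checks) / sizeof(checks[0]); i++)
        if (checks[i].check(st))
            return checks[i].source;
    return NMI_SRC_UNKNOWN;
}
```

If both a CPU-local source and the watchdog are pending, the CPU-local
check wins because it comes first in the table.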

I think we all agree on using order to determine the reason/source
of an NMI. The difference is that I want to keep as many direct calls in
default_do_nmi() as possible, while you guys want to wrap almost all
the code in default_do_nmi() into notifier handlers and leave only one
notify_die() in default_do_nmi(). And I want to use different die_vals
(and their calling order in default_do_nmi()) to determine the order,
while you guys want to use priority (based on its value) to determine
the order.
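
For comparison, the priority-based alternative might look roughly like
the user-space sketch below: a single chain kept sorted so that
handlers with a higher priority value run first, which is how kernel
notifier chains order their callbacks. All names here (chain_register(),
chain_call(), pass(), claim()) are hypothetical and only illustrate the
idea.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8

enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 };

struct handler {
    int priority;   /* higher value runs earlier */
    int (*fn)(void);
};

static struct handler chain[MAX_HANDLERS];
static size_t chain_len;

/* Insert a handler, keeping the chain sorted by descending priority. */
static void chain_register(int priority, int (*fn)(void))
{
    size_t i = chain_len;

    assert(chain_len < MAX_HANDLERS);
    /* Shift lower-priority entries one slot to the right. */
    while (i > 0 && chain[i - 1].priority < priority) {
        chain[i] = chain[i - 1];
        i--;
    }
    chain[i].priority = priority;
    chain[i].fn = fn;
    chain_len++;
}

/*
 * Walk the chain in priority order. Return the priority of the first
 * handler that claims the NMI, or -1 if none does.
 */
static int chain_call(void)
{
    size_t i;

    for (i = 0; i < chain_len; i++)
        if (chain[i].fn() == NOTIFY_STOP)
            return chain[i].priority;
    return -1;
}

/* Example handlers: one declines the NMI, the other claims it. */
static int pass(void)  { return NOTIFY_DONE; }
static int claim(void) { return NOTIFY_STOP; }
```

Here the order depends only on the registered priority values, not on
the registration order or on where a call appears in default_do_nmi().
With separate die_vals, by contrast, the order is fixed by the sequence
of notify_die() calls in default_do_nmi() itself.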

On the other hand, I think we should call the corresponding DIE_NMIxxx
before the default operations: for the watchdog, call DIE_NMIWATCHDOG
before panicking; for an unknown NMI, call DIE_NMIUNKNOWN before the
default processing (which may panic).

I think it is important to distinguish between the die chain used to
determine the source/reason of an NMI and the die chain used to see if
any other driver wants to do some processing before the default
operation.

Best Regards,
Huang Ying