Message-ID: <4DD07959.4030608@intel.com>
Date: Mon, 16 May 2011 09:09:45 +0800
From: Huang Ying <ying.huang@...el.com>
To: Cyrill Gorcunov <gorcunov@...il.com>
CC: huang ying <huang.ying.caritas@...il.com>,
Ingo Molnar <mingo@...e.hu>, Don Zickus <dzickus@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Andi Kleen <andi@...stfloor.org>,
Robert Richter <robert.richter@....com>,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC] x86, NMI, Treat unknown NMI as hardware error
On 05/15/2011 02:34 PM, Cyrill Gorcunov wrote:
> On 05/15/2011 04:06 AM, huang ying wrote:
> ...
>>>
>>> yes, is not good. But at least we *must* provide a way to turn this new feature off
>>> via command line I think. One of a reason for me is perf unknown nmis (at moment we seems
>>> to have captured and cured all parasite NMIs sources but there is no guarantee we wont
>>> meet them in future due to some code change or whatever). And bloating trap.c with
>>> new if()'s is not that good I guess, that is why I asked if there a way to do all the
>>> work via notifiers ;)
>>
>> Yes. We should consider about perf unknown NMI issues. But compared
>> with pushing all magic to user, I think the better way is to have a
>> better default behavior in kernel. For example, we can turn off
>> unknown NMI as hwerr logic temporarily if there are more than 1 perf
>> NMI events in action. Is that reasonable?
>
> I'm personally fine even if it's enabled by default, only worried to have
> an option to disable hwerr from boot line.

Is the white list mechanism not sufficient? Can spurious unknown NMIs
occur on white-listed machines? Don't people want to protect their data?

>> And, I am not a big fan of notifiers, that makes code hard to be
>> understood. If you have concerns about the size of traps.c, we can
>> move all NMI logic to a new file.
>
> Ying, the concern is rather related to the code scheme in general. Since
> we have notifiers I think the better way to be consistent here and use
> hwerr notifier too. But it's IMHO ;)

As for whether to use notifiers or not, IMHO a rule can be:

- If it is something like a driver, then it should go through a notifier.
- If it is an architectural or PC de facto standard, it can sit outside
  the notifier.

I think that treating an unknown NMI as a hardware error should be part of
the PC de facto standard. Don't you think so?
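
The conditions discussed in this thread (the white list, a boot-line switch to
turn the feature off, and backing off while more than one perf NMI event is
active) can be put together in a minimal user-space sketch. This is only a
model of the decision, not kernel code: the struct, the enum, and the function
name are all hypothetical, and none of it is taken from the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the inputs discussed in the thread. */
struct nmi_policy {
    bool machine_on_white_list;  /* platform is on the white list */
    bool disabled_on_boot_line;  /* assumed boot-line switch to turn the feature off */
    int  active_perf_nmi_events; /* perf events that may currently raise NMIs */
};

enum nmi_action {
    NMI_ACTION_LEGACY, /* keep the existing unknown-NMI handling */
    NMI_ACTION_HWERR   /* treat the unknown NMI as a hardware error */
};

/*
 * Decide how to handle an NMI that no handler claimed.
 * The hardware-error path is taken only when every condition
 * raised in the discussion allows it.
 */
static enum nmi_action classify_unknown_nmi(const struct nmi_policy *p)
{
    assert(p != NULL);

    if (p->disabled_on_boot_line)
        return NMI_ACTION_LEGACY;  /* the user opted out */
    if (!p->machine_on_white_list)
        return NMI_ACTION_LEGACY;  /* platform not known to be safe */
    if (p->active_perf_nmi_events > 1)
        return NMI_ACTION_LEGACY;  /* perf may be the source of stray NMIs */

    return NMI_ACTION_HWERR;
}
```

Keeping the decision in one small function like this corresponds to the
"sits outside the notifier" option; a notifier-based variant would register
the same check as a callback instead.
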
Best Regards,
Huang Ying