Message-ID: <BANLkTin1pvdnXtw53MhwfCsEQpsu0TUUmg@mail.gmail.com>
Date: Sun, 15 May 2011 08:06:30 +0800
From: huang ying <huang.ying.caritas@...il.com>
To: Cyrill Gorcunov <gorcunov@...il.com>
Cc: Huang Ying <ying.huang@...el.com>, Ingo Molnar <mingo@...e.hu>,
Don Zickus <dzickus@...hat.com>, linux-kernel@...r.kernel.org,
Andi Kleen <andi@...stfloor.org>,
Robert Richter <robert.richter@....com>,
Andi Kleen <ak@...ux.intel.com>
Subject: Re: [RFC] x86, NMI, Treat unknown NMI as hardware error
On Sat, May 14, 2011 at 3:51 PM, Cyrill Gorcunov <gorcunov@...il.com> wrote:
> On 05/14/2011 04:26 AM, huang ying wrote:
>> On Fri, May 13, 2011 at 11:17 PM, Cyrill Gorcunov <gorcunov@...il.com> wrote:
>>> Hi Ying,
>>>
>>> just curious (regardless the concerns Don and Ingo have) -- if there still a need
>>> for such semi-unknown nmi handling maybe it's worth to register a *notifier* for it
>>> and we panic only when user *explicitly* specify how to treat this class of NMIs
>>> (via say "hest-nmi-panic" boot option or something like that). Maybe such partially
>>> modular scheme would be better? If only I don't miss anything.
>>
>> Hi, Cyrill,
>>
>> IMHO, Pushing all policy to user is not good too. How many users
>> understand unknown NMI and hardware error clearly? It is better if we
>> can determine what is the right behavior.
>>
>
> yes, is not good. But at least we *must* provide a way to turn this new feature off
> via command line I think. One of a reason for me is perf unknown nmis (at moment we seems
> to have captured and cured all parasite NMIs sources but there is no guarantee we wont
> meet them in future due to some code change or whatever). And bloating trap.c with
> new if()'s is not that good I guess, that is why I asked if there a way to do all the
> work via notifiers ;)

Yes, we should take the perf unknown NMI issues into account. But
compared with pushing all the magic to the user, I think the better way
is to have a better default behavior in the kernel. For example, we can
temporarily turn off the "unknown NMI as hwerr" logic if more than one
perf NMI event is active. Is that reasonable?
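
To make the idea concrete, here is a rough user-space sketch, not a
patch. Every name in it (active_perf_nmi_events, perf_nmi_event_start,
perf_nmi_event_stop, handle_unknown_nmi, enum nmi_action) is made up
for illustration and is not an existing kernel symbol. Real code would
also need an atomic or per-CPU counter, since it is read in NMI context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter: how many perf NMI events are currently active. */
static int active_perf_nmi_events;

/* Hypothetical hooks, called when a perf NMI event is started or stopped. */
static void perf_nmi_event_start(void)
{
	active_perf_nmi_events++;
}

static void perf_nmi_event_stop(void)
{
	/* An unbalanced stop would be a bug in this sketch. */
	assert(active_perf_nmi_events > 0);
	active_perf_nmi_events--;
}

/* What to do with an NMI that no handler has claimed. */
enum nmi_action {
	NMI_ACTION_IGNORE,	/* old behavior: report it and continue */
	NMI_ACTION_HWERR,	/* new behavior: treat it as a hardware error */
};

/*
 * The "unknown NMI as hwerr" logic is switched off while more than one
 * perf NMI event is active, because a stray perf NMI could then be
 * mistaken for a hardware error.
 */
static bool unknown_nmi_as_hwerr_enabled(void)
{
	return active_perf_nmi_events <= 1;
}

static enum nmi_action handle_unknown_nmi(void)
{
	if (unknown_nmi_as_hwerr_enabled())
		return NMI_ACTION_HWERR;
	return NMI_ACTION_IGNORE;
}
```

With this, the user does not have to specify anything on the command
line: the hwerr handling comes back by itself once the number of active
perf NMI events drops to one or zero.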

Also, I am not a big fan of notifiers; they make the code hard to
understand. If you are concerned about the size of traps.c, we can
move all the NMI logic to a new file.

Best Regards,
Huang Ying