Message-ID: <20170307160748.z4dl7wdqusiybjda@redhat.com>
Date:   Tue, 7 Mar 2017 11:07:48 -0500
From:   Don Zickus <dzickus@...hat.com>
To:     Mike Travis <mike.travis@....com>
Cc:     Ingo Molnar <mingo@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Dimitri Sivanich <dimitri.sivanich@....com>,
        Frank Ramsay <frank.ramsay@....com>,
        Russ Anderson <russ.anderson@....com>,
        Tony Ernst <tony.ernst@....com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI
 call chain

On Tue, Mar 07, 2017 at 08:00:33AM -0800, Mike Travis wrote:
> 
> 
> On 3/7/2017 7:22 AM, Don Zickus wrote:
> > On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote:
> >>
> >> * Mike Travis <mike.travis@....com> wrote:
> >>
> >>> Add a new NMI call chain that is called last after all other NMI handlers
> >>> have been checked and did not "handle" the NMI.  This mimics the current
> >>> NMI_UNKNOWN call chain except it eliminates the WARNING message about
> >>> multiple NMI handlers registering on this call chain.
> >>>
> >>> This call chain dramatically lowers the NMI call frequency when high
> >>> frequency NMI tools are in use, notably the perf tools.  It is required
> >>> for NMI handlers that cannot sustain a high NMI call rate without
> >>> ramifications to the system operability.
> >>
> >> So how about we just turn off that warning instead? I don't remember the last time 
> >> it actually _helped_ us find any kernel or hardware bug - and it has caused tons 
> >> of problems...
> > 
> > Yeah, that is one way to solve it. :-)
> 
> Actually just removing the WARNING indication and making it an
> INFO message would be enough to quiet objections.  Is that enough,

That might be useful to at least log, but as Ingo said, even that might be
overkill in the real world.

> or should the message be completely removed for UNKNOWN NMI
> handlers, and left in place for IO_CHECK and SERR NMI handlers?
> 
> > 
> > Originally I put that in there because the unknown nmi handlers sometimes do
> > not return, making it impossible for the second handler to run.
> 
> The only two external unknown NMI handlers that I know of are the UV
> one and the KGDB one.  The KGDB one appears to be only claimed if it
> is exiting an NMI_LOCAL event.  And the UV one is only claimed if
> it was caused by a UV System NMI event.  So truly unknown NMI events
> eventually get to the internal unknown nmi handler.

Good to know.  I keep thinking of hpwdt, which wants to eat all NMIs; I
believe it still registers on the unknown nmi (drivers/watchdog/hpwdt.c).

Cheers,
Don

> 
> > But you are right, it probably hasn't really helped find any problems.  I
> > wasn't aware of problems it was causing (not that I was looking through
> > emails to find them either).
> > 
> > Cheers,
> > Don
> > 
> >>
> >> It's not like we warn about excess regular IRQs either - we either handle them or 
> >> at most increase a counter somewhere. We could do the same for NMIs: introduce a 
> >> counter somewhere that counts the number of seemingly unhandled NMIs.
> >>
> >> But in any case, we should not spam the kernel log, neither with high, nor with 
> >> low frequency.
> >>
> >> Thanks,
> >>
> >> 	Ingo
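
A rough userspace sketch of the counter idea quoted above, assuming a
simplified handler chain: every handler on the chain is called, and if none
claims the NMI a counter is bumped instead of printing a warning.  The names
nmi_handler_fn, dispatch_unknown_nmi and unhandled_nmi_count are made up for
illustration and are not the kernel's actual interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type: returns nonzero if it claimed the NMI. */
typedef int (*nmi_handler_fn)(void);

/* Hypothetical counter of NMIs that no handler on the chain claimed. */
static unsigned long unhandled_nmi_count;

/*
 * Call every handler on the chain and add up how many claimed the NMI.
 * If nobody claimed it, bump the counter quietly rather than logging.
 * Returns the number of handlers that claimed the event.
 */
static int dispatch_unknown_nmi(const nmi_handler_fn *chain, size_t n)
{
	int handled = 0;
	size_t i;

	assert(chain != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (chain[i])
			handled += chain[i]() ? 1 : 0;
	}

	if (!handled)
		unhandled_nmi_count++;	/* no log message, just a statistic */

	return handled;
}

/* Two toy handlers used only to exercise the sketch. */
static int handler_ignores(void) { return 0; }
static int handler_claims(void)  { return 1; }
```

With a chain made only of handler_ignores entries the counter goes up by one
per event; once handler_claims is on the chain the counter stays put.  The
counter could then be exported wherever other interrupt statistics live,
which is left out here.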
