linux-kernel - Re: [PATCH RFC] NMI Re-introduce un[set]_nmi

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080905085736.GD18485@elte.hu>
Date:	Fri, 5 Sep 2008 10:57:36 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Don Zickus <dzickus@...hat.com>,
	"Mingarelli, Thomas" <Thomas.Mingarelli@...com>,
	Andi Kleen <andi@...stfloor.org>,
	Prarit Bhargava <prarit@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"arozansk@...hat.com" <arozansk@...hat.com>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	"H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Maciej W. Rozycki" <macro@...ux-mips.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [PATCH RFC] NMI Re-introduce un[set]_nmi_callback


* Vivek Goyal <vgoyal@...hat.com> wrote:

> > Prarit's patch disabled the timer upon registering a callback to 
> > prevent this case.  The thought was if you have your own handler you 
> > could provide your own watchdog.
> 
> In theory, you could do the same while registering the handler on die 
> chain?
> 
> I don't get the whole point. So we are looking for a system where no 
> body else uses an NMI for any purpose and the moment NMI happens, this 
> driver will go and panic() the system. I don't get, what do we achive 
> by that?
> 
> Looks like you got some device in platform which raises an NMI and 
> which indicates that something is wrong and log the message and do a 
> panic(). But you don't have a way to find out if that device has 
> raised the interrupt or something else has raised the NMI, hence you 
> want to stop the watchdog timer and assume any NMI henceforth is from 
> device?
> 
> So any functionality which is dependent on NMI, will not work as long 
> as this driver is loaded.

exactly. The proposed patch brings the NMI notifier arms race to its 
next level: it is in essence a "super high priority" notifier that wants 
exclusivity. That is unnecessary: we already have sufficient 
infrastructure in place to recognize the priority of notifiers, and die 
notifiers make active use of it. (see the list below)

If the goal is to log relevant NMI events to NVRAM for later inspection 
then the solution is to register a low-priority notifier which will 
execute if no other notifier shows interest in an NMI event.

If the goal is to reboot the system if there's a "bad" NMI, then the 
solution is to register a low-priority notifier which will execute and 
panic the system if no other notifier shows interest in that NMI event.

This means all the 'good' NMI sources need to register at higher 
priority and all the standard platform fallback notifiers need to 
register at lowest priority. Debuggers go last. If any of the existing 
notifiers in the kernel dont follow the rules and cause problems then we 
can fix up those.

Here's a list of all current die notifiers we use in the kernel:

                                                                  prio
                                                                  ----
  # core kernel:

  kernel/kprobes.c                   # instruction probing        +INT_MAX
  kernel/trace/trace.c               # dump trace                 150

  # arch/x86:

  arch/x86/mm/kmmio.c                # mmiotrace, debug feature   0 
  arch/x86/kernel/kgdb.c             # kgdb, kernel debugger      -INT_MAX
  arch/x86/kernel/crash.c            # kdump                      0
  arch/x86/oprofile/nmi_int.c        # oprofile                   0
  arch/x86/oprofile/nmi_timer_int.c  # oprofile timer fallback    0

  # drivers:

  drivers/watchdog/hpwdt.c           # HP watchdog driver         +INT_MAX
  drivers/char/ipmi/ipmi_watchdog.c  # IPMI watchdog              150
  drivers/misc/sgi-xp/xpc_main.c     # SGI SN2                    0

  # upcoming:
  arch/x86/mm/kmemcheck/smp.c        # in tip/master              0
                                     [ this should change to +INT_MAX ]

One change we could do is to move arch/x86/kernel/crash.c's priority to 
-100. That would achieve the following effect: it would execute after 
all known sources of NMIs, but before the lowest prio interactive 
notifiers such as kgdb.

That way a platform driver could insert at priority -50 and execute 
after the known self-generated NMI sources, but before any of the debug 
and crash-tool notifiers. And the constants should be made symbolic as 
well, so that we encode the purpose of a notifier, not the 
implementation - and can map the purpose to priority and shape the 
ordering flexibly.

furthermore, as i indicated it in my first reply, if the port 61H read 
is causing problems on this platform (and i'd not be surprised if it did 
so) we can turn it into a lowest-prio fallback notifier which only 
executes if no known sources of NMIs show interest in an event.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/