Message-ID: <20120327160601.GA19273@redhat.com>
Date:	Tue, 27 Mar 2012 12:06:01 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	"Andrei E. Warkentin" <andrey.warkentin@...il.com>
Cc:	linux-kernel@...r.kernel.org, kgdb-bugreport@...ts.sourceforge.net,
	jason.wessel@...driver.com
Subject: Re: [PATCH] x86 NMI: Be smarter about invoking panic() inside NMI
 handler.

On Tue, Mar 20, 2012 at 01:57:41PM -0400, Andrei E. Warkentin wrote:
> Hi,
> 
> 2012/3/1 Andrei Warkentin <andrey.warkentin@...il.com>:
> > If two (or more) unknown NMIs arrive on different CPUs, there
> > is a large chance both CPUs will wind up inside panic(). This
> > is fine, unless you want to enter KDB - KDB cannot round up
> > all CPUs, because some of them are stuck inside
> > panic_smp_self_stop with NMI latched. This is
> > easy to replicate with QEMU. Boot with -smp 4 and
> > send NMI using the monitor.
> >
> > Solution for this - attempt to enter panic() from NMI
> > handler. If panic() is already active in the system,
> > just exit out of the NMI handler. This lets KDB round
> > up CPUs.
> >
> > Signed-off-by: Andrei Warkentin <andrey.warkentin@...il.com>
> > ---
> 
> Any feedback on this? Who are the right maintainers to bug about this?

Hmm, if try_panic fails, the CPU simply continues executing code.  That
might further corrupt an already broken system, so I don't think this
patch will work as is.

Perhaps instead of panicking in NMI context, we could use irq_work and
panic in interrupt context instead.  The system still stops (though it
might execute some interrupts first), and the panic happens outside NMI
context.
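
A minimal user-space sketch of that deferral idea, not kernel code: the
modeled NMI handler only records that a panic is wanted, and a later
callback standing in for the irq_work handler performs the panic outside
NMI context.  Every name below (model_unknown_nmi, model_run_irq_work,
model_panic, panic_pending, in_nmi_context, panic_calls) is hypothetical,
and a single global flag stands in for state that would be per-CPU in a
real kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
static atomic_bool panic_pending;  /* set by the modeled NMI handler */
static bool in_nmi_context;        /* true while the modeled NMI handler runs */
static int panic_calls;            /* number of times model_panic() has run */

/* Stand-in for panic(): must never be reached from modeled NMI context. */
static void model_panic(void)
{
    assert(!in_nmi_context);
    panic_calls++;
}

/*
 * Modeled unknown-NMI handler: it does not panic itself, it only marks
 * the panic as pending. Returns true if this call queued the work and
 * false if a panic was already pending.
 */
static bool model_unknown_nmi(void)
{
    bool expected = false;
    bool queued;

    in_nmi_context = true;
    queued = atomic_compare_exchange_strong(&panic_pending, &expected, true);
    in_nmi_context = false;
    return queued;
}

/* Modeled interrupt-context callback: runs the deferred panic once. */
static void model_run_irq_work(void)
{
    if (atomic_exchange(&panic_pending, false))
        model_panic();
}
```

In this model a second NMI that arrives before the callback runs does not
queue a second panic, and the panic itself only happens once the handler
has returned.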

However, you will still run into a similar problem in the panic/reboot
case, where we shut down all the remote CPUs and leave them sitting in a
similar cpu_relax loop in NMI context while the panicking CPU cleans
things up.

Cheers,
Don