Date:	Fri, 8 Feb 2008 12:26:58 -0500
From:	Neil Horman <nhorman@...hat.com>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Neil Horman <nhorman@...driver.com>, kexec@...ts.infradead.org,
	linux-kernel@...r.kernel.org, mingo@...hat.com,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...e.hu>,
	tglx@...utronix.de
Subject: Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

On Fri, Feb 08, 2008 at 11:45:44AM -0500, Vivek Goyal wrote:
> On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote:
> > On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote:
> > > 
> > > * Neil Horman <nhorman@...driver.com> wrote:
> > > 
> > > > Ingo noted a few posts down that nmi_exit doesn't actually write to the 
> > > > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I 
> > > > should have checked that more carefully).  Nevertheless, this patch 
> > > > consistently allowed a hanging machine to boot through an NMI lockup, 
> > > > so I'm forced to wonder what's going on, then, that this patch helps 
> > > > with.  Perhaps it's just a very fragile timing issue; I'll need to 
> > > > look more closely.
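
For context, the EOI in the subject line is just a write to the local APIC's
end-of-interrupt register, which nmi_exit() never touches. A minimal sketch of
what "issuing an EOI" amounts to, using the kernel's APIC accessors (purely
illustrative; this is not the patch under discussion):

#include <asm/apic.h>		/* apic_write(), ack_APIC_irq() */
#include <asm/apicdef.h>	/* APIC_EOI register offset     */

/*
 * Illustrative only: an EOI is a write of 0 to the local APIC's EOI
 * register; the kernel already wraps this as ack_APIC_irq().
 */
static inline void issue_apic_eoi(void)
{
	apic_write(APIC_EOI, 0);
}
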
> > > 
> > > try a dummy iret, something like:
> > > 
> > >   asm volatile ("pushf; push $1f; iret; 1: \n");
> > > 
> > > to get the CPU out of its 'nested NMI' state. (totally untested)
> > > 
> > > the idea is to push down an iret frame to the kernel stack that will 
> > > just jump to the next instruction and get it out of the NMI nesting. 
> > > Note: interrupts will/must still be disabled, despite the iret. (The 
> > > ordering of the pushes might be wrong, we might need more than that for 
> > > a valid iret, etc.)
> > > 
> > > 	Ingo
> > 
> > Just tried this experiment and it met with success.  Executing a dummy iret
> > instruction got us to boot the kdump kernel successfully.  
> > 
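
A 32-bit iret pops EIP, CS and EFLAGS off the stack, so a complete dummy frame
also needs the code segment pushed between the flags and the return address;
on x86_64 an iretq expects a full five-word frame (RIP, CS, RFLAGS, RSP, SS),
so the sketch below is 32-bit only. A minimal sketch of the trick Ingo
describes (illustrative, not necessarily the exact code tested here):

/*
 * Build a complete three-word iret frame (EFLAGS, CS, EIP) whose return
 * address is simply the next instruction, so execution falls straight
 * through while the iret takes the CPU out of its NMI-nesting state.
 * EFLAGS is pushed with IF still clear, so interrupts stay disabled.
 */
static inline void dummy_iret(void)
{
	asm volatile("pushfl\n\t"	/* EFLAGS                */
		     "pushl %%cs\n\t"	/* code segment selector */
		     "pushl $1f\n\t"	/* return EIP = label 1  */
		     "iret\n\t"
		     "1:"
		     : : : "memory");
}
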
> 
> Interesting. So that means there is some operation we can't perform while
> we are in an NMI handler (or in a nested NMI; I don't know whether this is
> a nested-NMI case).
> 
> Even if we initiated the crash dump from the NMI handler, the next kernel
> should unlock that state as soon as it enables interrupts (an iret will be
> executed at that point).
> 
> So the only issue here is whether we need to put explicit logic to unlock
> the NMI state earlier (either in the crashing kernel after clearing the IDT,
> or in the purgatory code). Anything earlier than that would be dangerous,
> though: we could end up handling another NMI while we have already crashed
> and are making the final preparations to jump to the new kernel.
> 
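
To make the first option concrete: unlocking in the crashing kernel would
amount to executing the dummy iret as the very last step before jumping to
purgatory, after the other CPUs are stopped and the IDT is cleared. A sketch
under those assumptions (the helper name is hypothetical, it reuses the
dummy_iret() sketched above, and it is only the option described here, not a
tested change):

/*
 * Hypothetical: unlock the NMI state as late as possible in the crashing
 * kernel, so the window in which another NMI could be taken is as small
 * as possible before control passes to purgatory.
 */
static void crash_unlock_nmi_state(void)
{
	/* other CPUs halted, local APIC disabled, IDT cleared... */
	dummy_iret();		/* drop the CPU's NMI-nesting state */
	/* ...now jump into purgatory / the capture kernel */
}
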
> Neil, is it possible to do some serial console debugging to find out
> where exactly we are hanging? Beats me what operation cannot be executed
> while in the NMI handler and makes the system hang. I am also curious to
> know whether this is a nested-NMI case.
> 
I can try, but my last attempts to do so found me hung in various places in
purgatory or very early in head.S.  I'll try again, though, to see if I can get
some consistency.

Neil


> Thanks
> Vivek
> 

-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman@...hat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/
