linux-kernel - Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <alpine.LFD.2.11.1503282302130.5791@eddie.linux-mips.org>
Date:	Sat, 28 Mar 2015 23:57:07 +0000 (GMT)
From:	"Maciej W. Rozycki" <macro@...ux-mips.org>
To:	Andy Lutomirski <luto@...capital.net>
cc:	Stefan Seyfried <stefan.seyfried@...glemail.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Takashi Iwai <tiwai@...e.de>,
	Denys Vlasenko <dvlasenk@...hat.com>, X86 ML <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm
 related?

On Wed, 18 Mar 2015, Andy Lutomirski wrote:

> > I posted the same problem to the opensuse kernel list shortly before turning
> > to LKML. There, Michal Kubecek noted:
> >
> > "I encountered a similar problem recently. The thing is, x86
> > specification says that on a double fault, RIP and RSP registers are
> > undefined, i.e. you not only can't expect them to contain values
> > corresponding to the first or second fault but you can't even expect
> > them to have any usable values at all. Unfortunately the kernel double
> > fault handler doesn't take this into account and does try to display
> > usual crash related information so that it itself does usually crash
> > when trying to show stack content (that's the show_stack_log_lvl()
> > crash).
> 
> I think that's not entirely true.  RIP is reliable for many classes of
> double faults, and we rely on that for espfix64.  The fact that hpa
> was willing to write that code strongly suggests that Intel chips at
> least really do work that way.

 A #DF won't deliberately clobber the instruction or the stack pointer.  
It's only that it may happen at a stage where either or both original 
pointers have been lost and replaced with new values already, possibly 
making them inconsistent with the corresponding segment selectors too (as 
they are not written at the same time).

 This will only happen in certain degenerate corner cases such as e.g. a 
problem with TSS (#TS) in the processing of a task gate used for taking 
the original exception, where a part of the new context has already been 
loaded before #DF resulted.  Another case will be a stack segment limit 
violation (#SS), where stack has been switched in the processing of a trap 
or interrupt gate, preventing return information and error code from being 
pushed for the original exception.  These are not conditions we'd normally 
observe in Linux.

 In other cases both the original instruction and the original stack 
pointer will have been retained.

  Maciej
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/