Message-ID: <550ACD41.6040607@redhat.com>
Date: Thu, 19 Mar 2015 14:21:05 +0100
From: Denys Vlasenko <dvlasenk@...hat.com>
To: Andy Lutomirski <luto@...capital.net>
CC: Linus Torvalds <torvalds@...ux-foundation.org>,
Stefan Seyfried <stefan.seyfried@...glemail.com>,
Takashi Iwai <tiwai@...e.de>, X86 ML <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

On 03/18/2015 10:55 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 2:42 PM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
>>> in 'irq_return_via_sysret' is new to 4.0, and instead of entering the
>>> kernel with a user stack pointer, maybe we're *exiting* the kernel,
>>> and have just reloaded the user stack pointer when "USERGS_SYSRET64"
>>> takes some fault.
>>
>> Yes, so far we happily thought that SYSRET never fails...
>>
>> This merits adding some code which would at least BUG_ON
>> if the faulting address is seen to match SYSRET64.
>
> sysret64 can only fail with #GP, and we're totally screwed if that
> happens, although I agree about the BUG_ON in principle. Where would
> we add it that would help in this case, though? We never even made it
> to C code.

I propose widening this check to catch any case where
we enter an exception from CPL0 and find that our RSP
is bad. This will cover the case of a faulting SYSRET and possible
future obscure bugs.

This patch stops the CPU dead if it finds itself
with a userspace RSP (not the saved RSP, but the _actual_ %RSP register)
in an exception handler prologue:

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index a0a3a6e..53a34ba 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -930,6 +930,12 @@ ENTRY(\sym)
 	INTR_FRAME
 	.endif
+	testq %rsp,%rsp
+	/* If RSP is positive, we are in kernel but have userspace RSP. */
+	/* We corrupted user stack already by storing iret frame there. */
+	/* This is supposed to be impossible. */
+0:	jns 0b
+
 	ASM_CLAC
 	PARAVIRT_ADJUST_EXCEPTION_FRAME

Hopefully the NMI watchdog will then kill it, and we'll get better data.
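
For readers less used to the asm, here is a rough C sketch of the predicate
that "testq %rsp,%rsp; jns" evaluates. It is only an illustration, not kernel
code: the function name is made up, and it relies on the usual x86-64 layout
where kernel addresses have bit 63 set and userspace addresses have it clear.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not part of the patch).
 *
 * "testq %rsp,%rsp" sets the sign flag from bit 63 of RSP, and "jns"
 * branches when that flag is clear. So the spin loop in the patch is
 * entered exactly when bit 63 of RSP is 0, i.e. when RSP is
 * non-negative as a signed 64-bit value.
 *
 * Returns 1 if the value looks like a userspace stack pointer under
 * that assumption, 0 otherwise.
 */
static int rsp_has_user_sign(uint64_t rsp)
{
	return (rsp >> 63) == 0;
}
```

Note that this only inspects the top bit. It says nothing about whether the
address is canonical or mapped, which is all the asm check does as well.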