linux-kernel - Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?


Message-ID: <CALCETrWOrgnJZN-NDjmqgTfs1iFBqqJevi4A60wKcv7mfF0BJA@mail.gmail.com>
Date:	Wed, 18 Mar 2015 14:17:09 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Denys Vlasenko <dvlasenk@...hat.com>
Cc:	Stefan Seyfried <stefan.seyfried@...glemail.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Takashi Iwai <tiwai@...e.de>, X86 ML <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

On Wed, Mar 18, 2015 at 2:06 PM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
> On 03/18/2015 09:49 PM, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
>>> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>>>> Hi Linus-
>>>>
>>>> You seem to enjoy debugging these things.  Want to give this a shot?
>>>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>>>> right after swapgs in syscall entry.
>>>
>>> The code is:
>>>
>>> ENTRY(system_call)
>>>         SWAPGS_UNSAFE_STACK
>>> GLOBAL(system_call_after_swapgs)
>>>         movq    %rsp,PER_CPU_VAR(rsp_scratch)
>>>         movq    PER_CPU_VAR(kernel_stack),%rsp
>>>
>>> If PER_CPU_VAR(var) memory access can page fault
>>> (I was thinking this is ensured to never fault),
>>> then on these two instructions such page fault
>>> will be fatal: we will still have userspace %rsp.
>>>
>>> I thought we could only get an NMI or debug interrupt here,
>>> and they are both set up to use IST stacks
>>> to prevent this scenario (among other reasons).
>>
>> I don't think that #DB is possible -- we should never have a
>> watchpoint on percpu memory like that (unless we're using kgdb, in
>> which case I think that kgdb should be fixed).
>
> And #DB shouldn't cause a problem even if it happens (it's on
> an IST stack).
>
> I was thinking about it more and the thing is, CPU did manage
> to enter page fault handler.
>
> It means that it managed to store iret frame.
>
> This means that stores to (%rsp) worked, whatever %rsp is
> (even if it points to user's page).
>
> The double fault happened only when CALL insn inside the handler
> attempted to push yet another word. _This_ is what did not work.
>
> Why?
>
> I'm almost ready to declare that it's SMAP triggering:
> that the attempt to access (write to) userspace was caught.
> However, disassembly shows
>
> crash> disassemble page_fault
> Dump of assembler code for function page_fault:
>    0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
>    0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
>    0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
>    0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
>    0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^KABOOM HERE^^^^^^^^^^^^^^^^^^^^^^^
>    0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
>    0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
>    0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
>    0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
>    0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
> End of assembler dump.
>
> Those NOPs at the beginning are ASM_CLAC and PARAVIRT_ADJUST_EXCEPTION_FRAME
> from this source:
>
>
> .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
> ENTRY(\sym)
>         /* Sanity check */
>         .if \shift_ist != -1 && \paranoid == 0
>         .error "using shift_ist requires paranoid=1"
>         .endif
>
>         .if \has_error_code
>         XCPT_FRAME
>         .else
>         INTR_FRAME
>         .endif
>
>         ASM_CLAC
>         PARAVIRT_ADJUST_EXCEPTION_FRAME
>
>         subq $ORIG_RAX-R15, %rsp
>         call error_entry
>         ...
>
> If ASM_CLAC is replaced by NOPs, this CPU must not be SMAP capable.
> If so, then another store to (%rsp) should have worked too...
>
>
> Stefan, Takashi - are you seeing this on SMAP-capable CPUs?

That's why I asked if this was Broadwell.  It's not :(

--Andy
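
For readers who want to answer the SMAP question for their own machine, here is a minimal userspace sketch (not from the thread). It assumes an x86 CPU and a GCC or Clang toolchain that ships <cpuid.h>. The helper names (ebx_has_smap, ebx_has_smep, read_cpuid7_ebx, print_smap_status) are made up for illustration. The only hardware facts relied on are that SMEP and SMAP are reported in CPUID leaf 7, subleaf 0, EBX bits 7 and 20 respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed available) */
#endif

/* CPUID.(EAX=7,ECX=0):EBX feature bits */
#define CPUID7_EBX_SMEP (UINT32_C(1) << 7)
#define CPUID7_EBX_SMAP (UINT32_C(1) << 20)

/* Hypothetical helpers: decode an already-fetched EBX value. */
static inline bool ebx_has_smep(uint32_t ebx)
{
        return (ebx & CPUID7_EBX_SMEP) != 0;
}

static inline bool ebx_has_smap(uint32_t ebx)
{
        return (ebx & CPUID7_EBX_SMAP) != 0;
}

/*
 * Fetch EBX of CPUID leaf 7, subleaf 0.
 * Returns false if the CPU does not implement leaf 7 or if this is
 * not an x86 build; *ebx is left untouched in that case.
 */
static inline bool read_cpuid7_ebx(uint32_t *ebx)
{
        assert(ebx != NULL);
#if defined(__x86_64__) || defined(__i386__)
        unsigned int a, b, c, d;

        if (__get_cpuid_max(0, NULL) < 7)
                return false;
        __cpuid_count(7, 0, a, b, c, d);
        (void)a; (void)c; (void)d;
        *ebx = b;
        return true;
#else
        return false;
#endif
}

/* Print what the CPU itself reports; the kernel may still disable SMAP. */
static inline void print_smap_status(void)
{
        uint32_t ebx;

        if (!read_cpuid7_ebx(&ebx)) {
                puts("CPUID leaf 7 not available: no SMEP/SMAP");
                return;
        }
        printf("SMEP: %s\n", ebx_has_smep(ebx) ? "yes" : "no");
        printf("SMAP: %s\n", ebx_has_smap(ebx) ? "yes" : "no");
}
```

Calling print_smap_status() from main() prints the two flags. This only shows what the CPU advertises, not whether the running kernel enabled the feature. On Linux the "smap" entry in the flags line of /proc/cpuinfo gives the kernel's view.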