linux-kernel - Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?


Message-ID: <5509E8C5.2070309@redhat.com>
Date:	Wed, 18 Mar 2015 22:06:13 +0100
From:	Denys Vlasenko <dvlasenk@...hat.com>
To:	Andy Lutomirski <luto@...capital.net>
CC:	Stefan Seyfried <stefan.seyfried@...glemail.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Takashi Iwai <tiwai@...e.de>, X86 ML <x86@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>
Subject: Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

On 03/18/2015 09:49 PM, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 1:06 PM, Denys Vlasenko <dvlasenk@...hat.com> wrote:
>> On 03/18/2015 08:26 PM, Andy Lutomirski wrote:
>>> Hi Linus-
>>>
>>> You seem to enjoy debugging these things.  Want to give this a shot?
>>> My guess is a vmalloc fault accessing either old_rsp or kernel_stack
>>> right after swapgs in syscall entry.
>>
>> The code is:
>>
>> ENTRY(system_call)
>>         SWAPGS_UNSAFE_STACK
>> GLOBAL(system_call_after_swapgs)
>>         movq    %rsp,PER_CPU_VAR(rsp_scratch)
>>         movq    PER_CPU_VAR(kernel_stack),%rsp
>>
>> If PER_CPU_VAR(var) memory access can page fault
>> (I was thinking this is ensured to never fault),
>> then on these two instructions such page fault
>> will be fatal: we will still have userspace %rsp.
>>
>> I thought we can only get a NMI or debug interrupt here,
>> and they are both set up to use IST stacks
>> to prevent this scenario (among other reasons).
> 
> I don't think that #DB is possible -- we should never have a
> watchpoint on percpu memory like that (unless we're using kgdb, in
> which case I think that kgdb should be fixed).

And #DB shouldn't cause a problem even if it happens (it's on
an IST stack).

I was thinking about it more, and the thing is, the CPU did manage
to enter the page fault handler.

It means that it managed to store the iret frame.

This means that stores to (%rsp) worked, whatever %rsp is
(even if it points to user's page).

The double fault happened only when CALL insn inside the handler
attempted to push yet another word. _This_ is what did not work.

Why?

I am almost ready to declare that it's SMAP triggering:
that the attempt to access (write to) userspace was caught.
However, the disassembly shows

crash> disassemble page_fault
Dump of assembler code for function page_fault:
   0xffffffff816834a0 <+0>:     data32 xchg %ax,%ax
   0xffffffff816834a3 <+3>:     data32 xchg %ax,%ax
   0xffffffff816834a6 <+6>:     data32 xchg %ax,%ax
   0xffffffff816834a9 <+9>:     sub    $0x78,%rsp
   0xffffffff816834ad <+13>:    callq  0xffffffff81683620 <error_entry>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^KABOOM HERE^^^^^^^^^^^^^^^^^^^^^^^
   0xffffffff816834b2 <+18>:    mov    %rsp,%rdi
   0xffffffff816834b5 <+21>:    mov    0x78(%rsp),%rsi
   0xffffffff816834ba <+26>:    movq   $0xffffffffffffffff,0x78(%rsp)
   0xffffffff816834c3 <+35>:    callq  0xffffffff810504e0 <do_page_fault>
   0xffffffff816834c8 <+40>:    jmpq   0xffffffff816836d0 <error_exit>
End of assembler dump.

Those NOPs at the beginning are ASM_CLAC and PARAVIRT_ADJUST_EXCEPTION_FRAME
from this source:

.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
        /* Sanity check */
        .if \shift_ist != -1 && \paranoid == 0
        .error "using shift_ist requires paranoid=1"
        .endif

        .if \has_error_code
        XCPT_FRAME
        .else
        INTR_FRAME
        .endif

        ASM_CLAC
        PARAVIRT_ADJUST_EXCEPTION_FRAME

        subq $ORIG_RAX-R15, %rsp
        call error_entry
        ...

If ASM_CLAC is replaced by NOPs, this CPU must not be SMAP-capable.
If so, then another store to (%rsp) should have worked too...
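
To make the stack arithmetic concrete, here is a minimal sketch in C (not
kernel code) of where the stores land if the fault is delivered on the
current, non-IST stack. It assumes 4 KiB pages, the 16-byte %rsp alignment
that 64-bit mode applies before pushing the frame, and a six-word frame
(SS, RSP, RFLAGS, CS, RIP, error code). The function names are made up for
illustration; the 0x78 comes from the sub in the disassembly above.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u      /* assumption: 4 KiB pages */
#define HW_FRAME_BYTES  (6u * 8u)  /* SS, RSP, RFLAGS, CS, RIP, error code */
#define PT_REGS_GAP     0x78u      /* "sub $0x78,%rsp" in page_fault */

/* Lowest address the CPU writes when it pushes the #PF frame. */
uint64_t hw_frame_low(uint64_t rsp_at_fault)
{
    /* 64-bit mode aligns %rsp down to 16 bytes before pushing the frame. */
    uint64_t aligned = rsp_at_fault & ~(uint64_t)0xf;
    return aligned - HW_FRAME_BYTES;
}

/* Address written by "callq error_entry" (it pushes the return address). */
uint64_t call_push_addr(uint64_t rsp_at_fault)
{
    return hw_frame_low(rsp_at_fault) - PT_REGS_GAP - 8u;
}

/* Nonzero if both addresses fall within the same page. */
int same_page(uint64_t a, uint64_t b)
{
    return (a / PAGE_SIZE_BYTES) == (b / PAGE_SIZE_BYTES);
}
```

For example, with %rsp = 0x7ffd00001080 at fault time, the frame bottoms out
at 0x7ffd00001050 while the CALL writes at 0x7ffd00000fd0, which is one page
lower. This only shows that the two stores need not hit the same page; it
does not say that this is what happened here.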

Stefan, Takashi - are you seeing this on SMAP-capable CPUs?
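
For anyone checking, below is a small sketch of a whole-word test over the
"flags" line of /proc/cpuinfo. It assumes SMAP is reported there as "smap";
has_cpu_flag is a hypothetical helper name, not an existing API.

```c
#include <string.h>

/*
 * Return 1 if `flag` appears as a whole, whitespace-delimited word in
 * `flags` (for example the text of the "flags" line of /proc/cpuinfo),
 * and 0 otherwise. A plain substring match is not enough, because one
 * flag name can be contained in another.
 */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    if (n == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        /* Word starts at the beginning of the string or after whitespace. */
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* Word ends at the end of the string or before whitespace. */
        int ends = p[n] == '\0' || p[n] == ' ' ||
                   p[n] == '\t' || p[n] == '\n';

        if (starts && ends)
            return 1;
        p++;  /* keep scanning past this partial match */
    }
    return 0;
}
```

Called as has_cpu_flag(line, "smap") on the flags line, it returns 1 only
when "smap" is listed as its own word.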