Date:   Sat, 25 Nov 2017 16:16:23 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Andy Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
        Borislav Petkov <bpetkov@...e.de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Brian Gerst <brgerst@...il.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH] x86/orc: Don't bail on stack overflow

Can you send me whatever config and exact commit hash generated this?
I can try to figure out why it failed.

On Sat, Nov 25, 2017 at 3:13 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Sat, 25 Nov 2017, Andy Lutomirski wrote:
>
>> On Sat, Nov 25, 2017 at 9:28 AM, Andy Lutomirski <luto@...nel.org> wrote:
>> > If we overflow the stack into a guard page and then try to unwind
>> > it with ORC, it should work perfectly: by construction, there can't
>> > be any meaningful data in the guard page because no writes to the
>> > guard page will have succeeded.
>> >
>> > ORC seems entirely capable of unwinding in this situation, except
>> > that it doesn't even try.  Adjust its initial stack check so that
>> > it's willing to try unwinding.
>> >
>> > I tested this by intentionally overflowing the task stack.  The
>> > result is an accurate call trace instead of a trace consisting
>> > purely of '?' entries.
>> >
>> > Signed-off-by: Andy Lutomirski <luto@...nel.org>
>> > ---
>> >
>> > Hi all-
>> >
>> > Ingo, this would have fixed half the debugging problem you had, I think.
>> > To really nail it, we'd want some kind of magic to annotate the trace
>> > so that page_fault (and async_page_fault) entries show CR2 and error_code.
>> >
>> > Josh, any ideas of how to do that cleanly?  We could easily hard-code it
>> > in the OOPS unwinder, I guess.
>>
>> Actually, this does pretty well.  We don't get CR2, but, when I added
>> an intentional bug kind of along the lines of the one you debugged,
>> the intermediate page fault successfully dumps all the regs in the
>> stack trace, so we get the faulting instruction *and* the registers.
>> We also get ORIG_RAX, which tells us the error code.  We could be
>> fancy and decode that.
>
> It works in general, but for that case it's not much better than before
> vs. the '?' entries.
>
> Thanks,
>
>         tglx
>
> [    2.556065] PANIC: double fault, error_code: 0x0
> [    2.557116] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
> [    2.558930] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [    2.560133] task: ffff880428dd8000 task.stack: ffffc900025fc000
> [    2.560729] RIP: 0010:page_fault+0x11/0x60
> [    2.561122] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
> [    2.561607] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
> [    2.562357] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
> [    2.563027] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
> [    2.563726] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
> [    2.564429] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
> [    2.565104] FS:  00007f9edc0b78c0(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000
> [    2.565844] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    2.566396] CR2: ffffffffff083fb8 CR3: 0000000428ec4005 CR4: 00000000001606e0
> [    2.567097] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    2.567761] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [    2.568451] Call Trace:
> [    2.568704]  <SYSENTER>
> [    2.568950]  ? __do_page_fault+0x4b0/0x4b0
> [    2.569348]  ? page_fault+0x2c/0x60
> [    2.569680]  ? native_iret+0x7/0x7
> [    2.570019]  ? __do_page_fault+0x4b0/0x4b0
> [    2.570396]  ? page_fault+0x2c/0x60
> [    2.570743]  ? call_function_interrupt+0xc0/0xc0
> [    2.571199]  </SYSENTER>
> [    2.571422] Code: ff e8 34 b7 6a ff e9 9f 02 00 00 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 83 c4 88 f6 84 24 88 00 00 00 03 75 20 <e8> 4a 01 00 00 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff
> [    2.573192] Kernel panic - not syncing: Machine halted.
> [    2.573694] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
> [    2.574528] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [    2.575330] Call Trace:
> [    2.575570]  <#DF>
> [    2.575760]  dump_stack+0x46/0x59
> [    2.576120]  panic+0xde/0x223
> [    2.576405]  df_debug+0x29/0x30
> [    2.576687]  do_double_fault+0x9a/0x120
> [    2.577057]  double_fault+0x22/0x30
> [    2.577376] RIP: 0010:page_fault+0x11/0x60
> [    2.577775] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
> [    2.578314] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
> [    2.578979] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
> [    2.579666] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
> [    2.580334] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
> [    2.581008] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
> [    2.581684]  ? native_iret+0x7/0x7
> [    2.582007] WARNING: can't dereference iret registers at ffffffffff084048 for ip page_fault+0x11/0x60
> [    2.582008]  </#DF>
> [    2.583134]  <SYSENTER>
> [    2.583367]  ? __do_page_fault+0x4b0/0x4b0
> [    2.583751]  ? page_fault+0x2c/0x60
> [    2.584127]  ? native_iret+0x7/0x7
> [    2.584466]  ? __do_page_fault+0x4b0/0x4b0
> [    2.584860]  ? page_fault+0x2c/0x60
> [    2.585195]  ? call_function_interrupt+0xc0/0xc0
> [    2.585621]  </SYSENTER>
> [    2.586966] Dumping ftrace buffer:
> [    2.587254]    (ftrace buffer empty)
> [    2.587534] Kernel Offset: disabled
> [    2.587814] ---[ end Kernel panic - not syncing: Machine halted.
>
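
The quoted discussion notes that ORIG_RAX in the dumped registers carries
the page fault error code and that it could be decoded. As a rough
illustration only, here is a minimal userspace sketch of such a decoder.
The bit layout is the architectural x86 page-fault error code; the macro
names and the helper decode_pf_error_code() are hypothetical and are not
taken from the patch or from the kernel source.

```c
#include <stdio.h>
#include <string.h>

/* Architectural x86 page-fault error code bits (names are illustrative). */
#define PF_PROT   (1UL << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE  (1UL << 1)  /* 0: read access, 1: write access */
#define PF_USER   (1UL << 2)  /* 0: kernel-mode access, 1: user-mode access */
#define PF_RSVD   (1UL << 3)  /* reserved bit set in a paging-structure entry */
#define PF_INSTR  (1UL << 4)  /* fault was an instruction fetch */
#define PF_PK     (1UL << 5)  /* protection-key violation */

/*
 * Hypothetical helper: render an error code (e.g. the ORIG_RAX value from
 * a register dump) as a human-readable string in buf. Returns buf.
 */
static const char *decode_pf_error_code(unsigned long code, char *buf, size_t len)
{
    const char *access;

    /* An instruction fetch takes precedence over the read/write bit. */
    if (code & PF_INSTR)
        access = "instruction fetch";
    else if (code & PF_WRITE)
        access = "write";
    else
        access = "read";

    snprintf(buf, len, "%s, %s, %s mode%s%s",
             access,
             (code & PF_PROT) ? "protection violation" : "not-present page",
             (code & PF_USER) ? "user" : "kernel",
             (code & PF_RSVD) ? ", reserved bit set" : "",
             (code & PF_PK) ? ", protection key" : "");
    return buf;
}
```

With this sketch, a kernel-mode write to an unmapped guard page would carry
error code 0x2 and decode as "write, not-present page, kernel mode".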
