Message-ID: <alpine.DEB.2.20.1711260009290.2316@nanos>
Date: Sun, 26 Nov 2017 00:13:12 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Andy Lutomirski <luto@...nel.org>
cc: X86 ML <x86@...nel.org>, Borislav Petkov <bpetkov@...e.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Brian Gerst <brgerst@...il.com>,
Dave Hansen <dave.hansen@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH] x86/orc: Don't bail on stack overflow
On Sat, 25 Nov 2017, Andy Lutomirski wrote:
> On Sat, Nov 25, 2017 at 9:28 AM, Andy Lutomirski <luto@...nel.org> wrote:
> > If we overflow the stack into a guard page and then try to unwind
> > it with ORC, it should work perfectly: by construction, there can't
> > be any meaningful data in the guard page because no writes to the
> > guard page will have succeeded.
> >
> > ORC seems entirely capable of unwinding in this situation, except
> > that it doesn't even try. Adjust its initial stack check so that
> > it's willing to try unwinding.
> >
> > I tested this by intentionally overflowing the task stack. The
> > result is an accurate call trace instead of a trace consisting
> > purely of '?' entries.
> >
> > Signed-off-by: Andy Lutomirski <luto@...nel.org>
> > ---
> >
> > Hi all-
> >
> > Ingo, this would have fixed half the debugging problem you had, I think.
> > To really nail it, we'd want some kind of magic to annotate the trace
> > so that page_fault (and async_page_fault) entries show CR2 and error_code.
> >
> > Josh, any ideas of how to do that cleanly? We could easily hard-code it
> > in the OOPS unwinder, I guess.
>
> Actually, this does pretty well. We don't get CR2, but, when I added
> an intentional bug kind of along the lines of the one you debugged,
> the intermediate page fault successfully dumps all the regs in the
> stack trace, so we get the faulting instruction *and* the registers.
> We also get ORIG_RAX, which tells us the error code. We could be
> fancy and decode that.
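
As an illustration of the "decode that" idea above, here is a minimal user-space sketch of turning a page-fault error code into text. The bit positions are the architectural x86 ones; the macro names and the function decode_pf_error_code() are hypothetical and are not taken from the patch or from the kernel's oops code.

```c
#include <assert.h>
#include <stdio.h>

/* Architectural x86 page-fault error code bits (macro names are illustrative). */
#define PF_PROT  (1UL << 0)	/* 0: page not present, 1: protection violation */
#define PF_WRITE (1UL << 1)	/* 0: read access, 1: write access */
#define PF_USER  (1UL << 2)	/* 0: kernel-mode access, 1: user-mode access */
#define PF_RSVD  (1UL << 3)	/* reserved bit set in a paging-structure entry */
#define PF_INSTR (1UL << 4)	/* fault was an instruction fetch */
#define PF_PK    (1UL << 5)	/* protection-key violation */

/* Hypothetical helper: write a human-readable description of ec into buf. */
static void decode_pf_error_code(unsigned long ec, char *buf, size_t len)
{
	assert(buf && len > 0);

	snprintf(buf, len, "%s, %s mode, %s%s%s",
		 (ec & PF_INSTR) ? "instruction fetch" :
		 (ec & PF_WRITE) ? "write" : "read",
		 (ec & PF_USER) ? "user" : "kernel",
		 (ec & PF_PROT) ? "protection violation" : "page not present",
		 (ec & PF_RSVD) ? ", reserved bit set" : "",
		 (ec & PF_PK) ? ", protection key" : "");
}
```

For example, an error code of 0x2 decodes to "write, kernel mode, page not present".
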

It works in general, but for that case it's not much better than before
with respect to the '?' entries.

Thanks,

	tglx

[ 2.556065] PANIC: double fault, error_code: 0x0
[ 2.557116] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
[ 2.558930] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 2.560133] task: ffff880428dd8000 task.stack: ffffc900025fc000
[ 2.560729] RIP: 0010:page_fault+0x11/0x60
[ 2.561122] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
[ 2.561607] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
[ 2.562357] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
[ 2.563027] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
[ 2.563726] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
[ 2.564429] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
[ 2.565104] FS: 00007f9edc0b78c0(0000) GS:ffff88042e440000(0000) knlGS:0000000000000000
[ 2.565844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.566396] CR2: ffffffffff083fb8 CR3: 0000000428ec4005 CR4: 00000000001606e0
[ 2.567097] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.567761] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2.568451] Call Trace:
[ 2.568704] <SYSENTER>
[ 2.568950] ? __do_page_fault+0x4b0/0x4b0
[ 2.569348] ? page_fault+0x2c/0x60
[ 2.569680] ? native_iret+0x7/0x7
[ 2.570019] ? __do_page_fault+0x4b0/0x4b0
[ 2.570396] ? page_fault+0x2c/0x60
[ 2.570743] ? call_function_interrupt+0xc0/0xc0
[ 2.571199] </SYSENTER>
[ 2.571422] Code: ff e8 34 b7 6a ff e9 9f 02 00 00 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 83 c4 88 f6 84 24 88 00 00 00 03 75 20 <e8> 4a 01 00 00 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff
[ 2.573192] Kernel panic - not syncing: Machine halted.
[ 2.573694] CPU: 1 PID: 273 Comm: systemd-udevd Not tainted 4.14.0-01256-g03dea81fe9f2-dirty #30
[ 2.574528] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 2.575330] Call Trace:
[ 2.575570] <#DF>
[ 2.575760] dump_stack+0x46/0x59
[ 2.576120] panic+0xde/0x223
[ 2.576405] df_debug+0x29/0x30
[ 2.576687] do_double_fault+0x9a/0x120
[ 2.577057] double_fault+0x22/0x30
[ 2.577376] RIP: 0010:page_fault+0x11/0x60
[ 2.577775] RSP: 0000:ffffffffff083fc8 EFLAGS: 00010046
[ 2.578314] RAX: 00000000819d0ac7 RBX: 0000000000000001 RCX: ffffffff819d0ac7
[ 2.578979] RDX: 0000000000000000 RSI: 0000000000000010 RDI: ffffffffff084078
[ 2.579666] RBP: 000000000000000b R08: 00000000ffffffff R09: 0000000000000040
[ 2.580334] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000000003
[ 2.581008] R13: 000055719fd7d410 R14: 0000000000000000 R15: 0000000003938700
[ 2.581684] ? native_iret+0x7/0x7
[ 2.582007] WARNING: can't dereference iret registers at ffffffffff084048 for ip page_fault+0x11/0x60
[ 2.582008] </#DF>
[ 2.583134] <SYSENTER>
[ 2.583367] ? __do_page_fault+0x4b0/0x4b0
[ 2.583751] ? page_fault+0x2c/0x60
[ 2.584127] ? native_iret+0x7/0x7
[ 2.584466] ? __do_page_fault+0x4b0/0x4b0
[ 2.584860] ? page_fault+0x2c/0x60
[ 2.585195] ? call_function_interrupt+0xc0/0xc0
[ 2.585621] </SYSENTER>
[ 2.586966] Dumping ftrace buffer:
[ 2.587254] (ftrace buffer empty)
[ 2.587534] Kernel Offset: disabled
[ 2.587814] ---[ end Kernel panic - not syncing: Machine halted.
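
The quoted patch description talks about relaxing the unwinder's initial stack check so that a stack pointer sitting in a guard page no longer stops the unwind. The following is only a user-space sketch of one way such a check could look, not the code from the patch. The struct, the function names, and the 4 KiB page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Hypothetical description of one stack: the half-open range [begin, end). */
struct stack_range {
	uint64_t begin;
	uint64_t end;
};

static bool on_stack(const struct stack_range *s, uint64_t addr)
{
	return addr >= s->begin && addr < s->end;
}

/* Round addr up to the next page boundary (unchanged if already aligned). */
static uint64_t page_align_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Relaxed check: accept sp if it is on the stack, or if the first page
 * boundary above it is on the stack. The second case covers an sp that
 * has overflowed into the guard page directly below the stack.
 */
static bool can_start_unwind(const struct stack_range *s, uint64_t sp)
{
	if (on_stack(s, sp))
		return true;

	return on_stack(s, page_align_up(sp));
}
```

With a stack base of ffffc900025fc000 and an assumed size of 16 KiB, a pointer 0x38 bytes below the base is accepted, while a pointer two pages below the base is rejected.
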