[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160623155531.i2kbnokosskimvmf@treble>
Date: Thu, 23 Jun 2016 10:55:31 -0500
From: Josh Poimboeuf <jpoimboe@...hat.com>
To: Andy Lutomirski <luto@...capital.net>
Cc: Jiri Kosina <jikos@...nel.org>, Ingo Molnar <mingo@...hat.com>,
X86 ML <x86@...nel.org>,
Heiko Carstens <heiko.carstens@...ibm.com>,
"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
live-patching@...r.kernel.org,
Michael Ellerman <mpe@...erman.id.au>,
Chris J Arges <chris.j.arges@...onical.com>,
Jessica Yu <jeyu@...hat.com>, linuxppc-dev@...ts.ozlabs.org,
Petr Mladek <pmladek@...e.com>, Jiri Slaby <jslaby@...e.cz>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Vojtech Pavlik <vojtech@...e.com>,
Miroslav Benes <mbenes@...e.cz>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ
tracking
On Wed, Jun 22, 2016 at 05:09:11PM -0700, Andy Lutomirski wrote:
> On Wed, Jun 22, 2016 at 9:30 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > Andy,
> >
> > So I got a chance to look at this some more. I'm thinking that to make
> > this feature more consistently useful, we shouldn't only annotate
> > pt_regs frames for calls to handlers; other calls should be annotated as
> > well: preempt_schedule_irq, CALL_enter_from_user_mode,
> > prepare_exit_to_usermode, SWAPGS, TRACE_IRQS_OFF, DISABLE_INTERRUPTS,
> > etc. That way, the unwinder will always be able to find pt_regs from an
> > interrupt/exception, even if starting from one of these other calls.
> >
> > But then, things get ugly. You have to either setup and tear down the
> > frame for every possible call, or do a higher-level setup/teardown
> > across multiple calls, which invalidates several assumptions in the
> > entry code about the location of pt_regs on the stack.
> >
>
> Here's yet another harebrained idea. Maybe it works better than my
> previous harebrained ideas :)
>
> Your patch is already creating a somewhat nonstandard stack frame:
>
> + movq %rbp, 0*8(%rsp)
> + movq $entry_frame_ret, 1*8(%rsp)
> + movq %rsp, %rbp
>
> It's kind of a normal stack frame, but rbp points at something odd,
> and to unwind it fully correctly, the unwinder needs to know about it.
>
> What if we made it even more special, along the lines of:
>
> leaq offset_to_ptregs(%rsp), %rbp
> xorq $-1, %rbp
>
> IOW, don't write anything to the stack at all, and just put a special
> value into RBP that says "the next frame is pt_regs at such-and-such
> address". Do this once on entry and make sure to restore RBP (from
> pt_regs) on exit. Now the unwinder can notice that RBP has the high
> bits clear *and* that the negation of it points to the stack, and it
> can figure out what's going on.
>
> What do you think? Am I nuts or could this work?
>
> It had better not have much risk of breaking things worse than they
> currently are, given that current kernel allow user code to stick any
> value it likes into the very last element of the RBP chain.
I think it's a good idea, and it could work... BUT it would break
external unwinders like gdb for the in-kernel entry case.
For interrupts and exceptions in kernel mode, rbp *is* valid. Sure, it
doesn't tell you the interrupted function, but it does tell you its
caller. A generic frame pointer unwinder skips the interrupted
function, but at least it keeps going. If we encoded rbp on entry, that
would break.
--
Josh
Powered by blists - more mailing lists