linux-kernel - Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ tracking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Thu, 23 Jun 2016 10:55:31 -0500
From:	Josh Poimboeuf <jpoimboe@...hat.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Jiri Kosina <jikos@...nel.org>, Ingo Molnar <mingo@...hat.com>,
	X86 ML <x86@...nel.org>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	"linux-s390@...r.kernel.org" <linux-s390@...r.kernel.org>,
	live-patching@...r.kernel.org,
	Michael Ellerman <mpe@...erman.id.au>,
	Chris J Arges <chris.j.arges@...onical.com>,
	Jessica Yu <jeyu@...hat.com>, linuxppc-dev@...ts.ozlabs.org,
	Petr Mladek <pmladek@...e.com>, Jiri Slaby <jslaby@...e.cz>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Vojtech Pavlik <vojtech@...e.com>,
	Miroslav Benes <mbenes@...e.cz>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH v2 05/18] sched: add task flag for preempt IRQ
 tracking

On Wed, Jun 22, 2016 at 05:09:11PM -0700, Andy Lutomirski wrote:
> On Wed, Jun 22, 2016 at 9:30 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > Andy,
> >
> > So I got a chance to look at this some more.  I'm thinking that to make
> > this feature more consistently useful, we shouldn't only annotate
> > pt_regs frames for calls to handlers; other calls should be annotated as
> > well: preempt_schedule_irq, CALL_enter_from_user_mode,
> > prepare_exit_to_usermode, SWAPGS, TRACE_IRQS_OFF, DISABLE_INTERRUPTS,
> > etc.  That way, the unwinder will always be able to find pt_regs from an
> > interrupt/exception, even if starting from one of these other calls.
> >
> > But then, things get ugly.  You have to either setup and tear down the
> > frame for every possible call, or do a higher-level setup/teardown
> > across multiple calls, which invalidates several assumptions in the
> > entry code about the location of pt_regs on the stack.
> >
> 
> Here's yet another harebrained idea.  Maybe it works better than my
> previous harebrained ideas :)
> 
> Your patch is already creating a somewhat nonstandard stack frame:
> 
> +       movq    %rbp,                   0*8(%rsp)
> +       movq    $entry_frame_ret,       1*8(%rsp)
> +       movq    %rsp, %rbp
> 
> It's kind of a normal stack frame, but rbp points at something odd,
> and to unwind it fully correctly, the unwinder needs to know about it.
> 
> What if we made it even more special, along the lines of:
> 
> leaq offset_to_ptregs(%rsp), %rbp
> xorq $-1, %rbp
> 
> IOW, don't write anything to the stack at all, and just put a special
> value into RBP that says "the next frame is pt_regs at such-and-such
> address".  Do this once on entry and make sure to restore RBP (from
> pt_regs) on exit.  Now the unwinder can notice that RBP has the high
> bits clear *and* that the negation of it points to the stack, and it
> can figure out what's going on.
> 
> What do you think?  Am I nuts or could this work?
> 
> It had better not have much risk of breaking things worse than they
> currently are, given that current kernel allow user code to stick any
> value it likes into the very last element of the RBP chain.

I think it's a good idea, and it could work... BUT it would break
external unwinders like gdb for the in-kernel entry case.

For interrupts and exceptions in kernel mode, rbp *is* valid.  Sure, it
doesn't tell you the interrupted function, but it does tell you its
caller.  A generic frame pointer unwinder skips the interrupted
function, but at least it keeps going.  If we encoded rbp on entry, that
would break.

-- 
Josh