linux-kernel - Re: [PATCH 10/19] x86/dumpstack: add get_stack

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160722235459.xtikpj263hroloqo@treble>
Date:	Fri, 22 Jul 2016 18:54:59 -0500
From:	Josh Poimboeuf <jpoimboe@...hat.com>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	"H . Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Brian Gerst <brgerst@...il.com>,
	Kees Cook <keescook@...omium.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Byungchul Park <byungchul.park@....com>
Subject: Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

On Fri, Jul 22, 2016 at 04:26:46PM -0700, Andy Lutomirski wrote:
> On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > valid_stack_ptr() is buggy: it assumes that all stacks are of size
> > THREAD_SIZE, which is not true for exception stacks.  So the
> > walk_stack() callbacks will need to know the location of the beginning
> > of the stack as well as the end.
> >
> > Another issue is that in general the various features of a stack (type,
> > size, next stack pointer, description string) are scattered around in
> > various places throughout the stack dump code.
> 
> I finally figured out what visit_info is.  But would it make more
> sense to track it in the unwind state so that the unwinder can
> directly make sure it doesn't start looping?

Well, the unwinders aren't the only users of get_stack_info() and the
visit_mask.  show_trace_log_lvl() also uses it.

But it would probably be cleaner to at least do the visit_mask bit
testing/setting in get_stack_info() rather than in the in_*_stack()
functions.

> And please remove test_and_set_bit() -- it's pointlessly slow.

Ok.

> 
> > +static bool in_hardirq_stack(unsigned long *stack, struct stack_info *info,
> > +                            unsigned long *visit_mask)
> > +{
> > +       unsigned long *begin = (unsigned long *)this_cpu_read(hardirq_stack);
> > +       unsigned long *end   = begin + (THREAD_SIZE / sizeof(long));
> > +
> > +       if (stack < begin || stack >= end)
> > +               return false;
> > +
> > +       if (visit_mask && test_and_set_bit(STACK_TYPE_IRQ, visit_mask))
> > +               return false;
> > +
> > +       info->type      = STACK_TYPE_IRQ;
> > +       info->begin     = begin;
> > +       info->end       = end;
> > +       info->next      = (unsigned long *)*begin;
> 
> This works, but it's a bit magic.  I don't suppose we could get rid of
> this ->next thing entirely and teach show_stack_log_lvl(), etc. to
> move from stack to stack by querying the stack type of whatever the
> frame base address is if the frame base address ends up being out of
> bounds for the current stack?  Or maybe the unwinder could even do
> this by itself.

I'm not quite sure what you mean here.  The ->next stack pointer is
quite useful and it abstracts that ugliness away from the callers of
get_stack_info().  I'm open to any specific suggestions.

> 
> > +static bool in_exception_stack(unsigned long *s, struct stack_info *info,
> > +                              unsigned long *visit_mask)
> >  {
> >         unsigned long stack = (unsigned long)s;
> >         unsigned long begin, end;
> > @@ -44,55 +63,62 @@ static unsigned long *in_exception_stack(unsigned long *s, char **name,
> >                 if (stack < begin || stack >= end)
> >                         continue;
> >
> > -               if (test_and_set_bit(k, visit_mask))
> > +               if (visit_mask &&
> > +                   test_and_set_bit(STACK_TYPE_EXCEPTION + k, visit_mask))
> >                         return false;
> >
> > -               *name = exception_stack_names[k];
> > -               return (unsigned long *)end;
> > +               info->type      = STACK_TYPE_EXCEPTION + k;
> > +               info->begin     = (unsigned long *)begin;
> > +               info->end       = (unsigned long *)end;
> > +               info->next      = (unsigned long *)info->end[-2];
> 
> This is so magical that I don't immediately see why it's correct.
> Presumably it's because the thing two slots down from the top of the
> stack is regs->sp?  If so, that needs a comment.

Heck if I know, I just stole it from dump_trace() ;-)

I'll figure it out and add a comment.

> But again, couldn't we use the fact that we now know how to decode
> pt_regs to avoid needing this?  I can imagine it being useful as a
> fallback in the event that the unwinder fails, but this is just a
> fallback.

Yeah, this is needed as a fallback.  But I wouldn't call it "just" a
fallback: the stack dump code *needs* to be able to still traverse the
stacks if frame pointers fail.

> Also, NMI is weird and I'm wondering whether this works at
> all when trying to unwind from a looped NMI.

Unless I'm missing something, I think it should be fine for nested NMIs,
since they're all on the same stack.  I can try to test it.  What in
particular are you worried about?

> Fixing this up could be a followup after this series is in, I think --
> you're preserving existing behavior AFAICS.  I just don't particularly
> like the existing behavior.
> 
> FWIW, I think this code needs to be explicitly tested for the 32-bit
> double fault case.  It's highly magical.

Ok, I'll test it.

-- 
Josh