linux-kernel - Re: [PATCH 10/19] x86/dumpstack: add get_stack

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CALCETrWLKgYB9KaSZgY3Hz9uVvYNrFc5Yw8moz=CdzJu4nN06g@mail.gmail.com>
Date:	Fri, 22 Jul 2016 16:26:46 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Josh Poimboeuf <jpoimboe@...hat.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	"H . Peter Anvin" <hpa@...or.com>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Brian Gerst <brgerst@...il.com>,
	Kees Cook <keescook@...omium.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Byungchul Park <byungchul.park@....com>
Subject: Re: [PATCH 10/19] x86/dumpstack: add get_stack_info() interface

On Thu, Jul 21, 2016 at 2:21 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> valid_stack_ptr() is buggy: it assumes that all stacks are of size
> THREAD_SIZE, which is not true for exception stacks.  So the
> walk_stack() callbacks will need to know the location of the beginning
> of the stack as well as the end.
>
> Another issue is that in general the various features of a stack (type,
> size, next stack pointer, description string) are scattered around in
> various places throughout the stack dump code.

I finally figured out what visit_info is.  But would it make more
sense to track it in the unwind state so that the unwinder can
directly make sure it doesn't start looping?

And please remove test_and_set_bit() -- it's pointlessly slow.

> +static bool in_hardirq_stack(unsigned long *stack, struct stack_info *info,
> +                            unsigned long *visit_mask)
> +{
> +       unsigned long *begin = (unsigned long *)this_cpu_read(hardirq_stack);
> +       unsigned long *end   = begin + (THREAD_SIZE / sizeof(long));
> +
> +       if (stack < begin || stack >= end)
> +               return false;
> +
> +       if (visit_mask && test_and_set_bit(STACK_TYPE_IRQ, visit_mask))
> +               return false;
> +
> +       info->type      = STACK_TYPE_IRQ;
> +       info->begin     = begin;
> +       info->end       = end;
> +       info->next      = (unsigned long *)*begin;

This works, but it's a bit magic.  I don't suppose we could get rid of
this ->next thing entirely and teach show_stack_log_lvl(), etc. to
move from stack to stack by querying the stack type of whatever the
frame base address is if the frame base address ends up being out of
bounds for the current stack?  Or maybe the unwinder could even do
this by itself.

> +static bool in_exception_stack(unsigned long *s, struct stack_info *info,
> +                              unsigned long *visit_mask)
>  {
>         unsigned long stack = (unsigned long)s;
>         unsigned long begin, end;
> @@ -44,55 +63,62 @@ static unsigned long *in_exception_stack(unsigned long *s, char **name,
>                 if (stack < begin || stack >= end)
>                         continue;
>
> -               if (test_and_set_bit(k, visit_mask))
> +               if (visit_mask &&
> +                   test_and_set_bit(STACK_TYPE_EXCEPTION + k, visit_mask))
>                         return false;
>
> -               *name = exception_stack_names[k];
> -               return (unsigned long *)end;
> +               info->type      = STACK_TYPE_EXCEPTION + k;
> +               info->begin     = (unsigned long *)begin;
> +               info->end       = (unsigned long *)end;
> +               info->next      = (unsigned long *)info->end[-2];

This is so magical that I don't immediately see why it's correct.
Presumably it's because the thing two slots down from the top of the
stack is regs->sp?  If so, that needs a comment.

But again, couldn't we use the fact that we now know how to decode
pt_regs to avoid needing this?  I can imagine it being useful as a
fallback in the event that the unwinder fails, but this is just a
fallback.  Also, NMI is weird and I'm wondering whether this works at
all when trying to unwind from a looped NMI.

Fixing this up could be a followup after this series is in, I think --
you're preserving existing behavior AFAICS.  I just don't particularly
like the existing behavior.

FWIW, I think this code needs to be explicitly tested for the 32-bit
double fault case.  It's highly magical.