Message-ID: <20161117160700.GF3117@twins.programming.kicks-ass.net>
Date:   Thu, 17 Nov 2016 17:07:00 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Josh Poimboeuf <jpoimboe@...hat.com>
Cc:     Vince Weaver <vincent.weaver@...ne.edu>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        "davej@...emonkey.org.uk" <davej@...emonkey.org.uk>,
        "dvyukov@...gle.com" <dvyukov@...gle.com>,
        Stephane Eranian <eranian@...il.com>
Subject: Re: perf: fuzzer KASAN unwind_get_return_address

On Thu, Nov 17, 2016 at 09:18:48AM -0600, Josh Poimboeuf wrote:
> On Thu, Nov 17, 2016 at 10:04:46AM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 16, 2016 at 10:48:28PM -0600, Josh Poimboeuf wrote:
> > > Peter or Vince, can you try to recreate with this patch?  It dumps the
> > > raw stack contents during a stack dump.  Hopefully that would give a
> > > clue about what's going wrong.
> > 
> > 
> > Here goes... I'll do another run and get you the results of that as
> > well.
> 
> Thanks, I just waded through this and it turned up some good clues.  And
> according to 'git blame', you might be able to help :-)
> 
> It's not stack corruption.  Instead it looks like
> __intel_pmu_pebs_event() is creating a bad or stale pt_regs which gets
> passed to the unwinder.  Specifically, regs->bp points to a seemingly
> random address on the NMI stack.  Which seems odd, considering the code
> itself is running on the same NMI stack.
> 
> I don't know much about the PEBS code but it seems like it's passing
> some stale data.  Either that or there's some NMI nesting going on.

Ooh, indeed. The PEBS record can be quite stale by the time we get to
the interrupt. Using those registers for an unwind is 'interesting' at
best.

Esp. with the multi-PEBS stuff that's landed, the record can be very, very
stale, but even single PEBS can have a radically different stack at
interrupt time than we had at record time -- imagine an (i)ret happening
in between.
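
To make the staleness concrete, here is a toy userspace sketch, not kernel
code. Everything in it (struct toy_stack, toy_call, toy_ret, toy_unwind) is
hypothetical and exists only for illustration. It models a stack of
frame-pointer records, remembers a bp at "record time", lets a return and a
new call happen, and then unwinds from the remembered bp at "interrupt time".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16
#define NO_FRAME ((size_t)-1)

/*
 * Toy stack, growing down from index STACK_WORDS. A frame record is two
 * words: mem[bp] holds the caller's bp index, mem[bp + 1] the return address.
 * Hypothetical model only; this is not the kernel's struct pt_regs or unwinder.
 */
struct toy_stack {
	uintptr_t mem[STACK_WORDS];
	size_t sp;	/* index of the lowest used word */
	size_t bp;	/* index of the current frame record, or NO_FRAME */
};

static void toy_init(struct toy_stack *s)
{
	for (size_t i = 0; i < STACK_WORDS; i++)
		s->mem[i] = 0;
	s->sp = STACK_WORDS;
	s->bp = NO_FRAME;
}

/* Push a frame record, as a call followed by "push bp; mov sp, bp" would. */
static void toy_call(struct toy_stack *s, uintptr_t ret_addr)
{
	assert(s->sp >= 2);
	s->sp -= 2;
	s->mem[s->sp] = (uintptr_t)s->bp;
	s->mem[s->sp + 1] = ret_addr;
	s->bp = s->sp;
}

/* Pop the current frame; its two words stay in memory and may be reused. */
static void toy_ret(struct toy_stack *s)
{
	assert(s->bp != NO_FRAME);
	size_t caller_bp = (size_t)s->mem[s->bp];

	s->sp = s->bp + 2;
	s->bp = caller_bp;
}

/* Follow the bp chain from 'bp', collecting up to 'max' return addresses. */
static size_t toy_unwind(const struct toy_stack *s, size_t bp,
			 uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (bp != NO_FRAME && bp + 1 < STACK_WORDS && n < max) {
		out[n++] = s->mem[bp + 1];
		bp = (size_t)s->mem[bp];
	}
	return n;
}
```

If a bp is saved while two frames are live, and the inner function then
returns and a different function is called before the unwind runs, the saved
bp still lands on a valid-looking frame record, but the return address there
belongs to the new call rather than to the sampled context. In the real case
the slot need not hold a frame record at all, which would match the
"seemingly random address" observation above.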

Let me consider that code, and what to do about this; it's been a while
since I went over all that.