Date:   Sun, 11 Nov 2018 19:26:37 -0800
From:   Andi Kleen <ak@...ux.intel.com>
To:     Travis Downs <travis.downs@...il.com>
Cc:     Milian Wolff <milian.wolff@...b.com>, jolsa@...hat.com,
        linux-kernel@...r.kernel.org, jolsa@...nel.org,
        namhyung@...nel.org, linux-perf-users@...r.kernel.org,
        acme@...nel.org
Subject: Re: PEBS level 2/3 breaks dwarf unwinding! [WAS: Re: Broken dwarf
 unwinding - wrong stack pointer register value?]

On Sat, Nov 10, 2018 at 09:50:05PM -0500, Travis Downs wrote:
>    On Sat, Nov 10, 2018 at 8:07 PM Andi Kleen <ak@...ux.intel.com> wrote:
> 
>      On Sat, Nov 10, 2018 at 04:42:48PM -0500, Travis Downs wrote:
>      > I guess this problem doesn't occur for LBR unwinding since the LBR
>      > records are captured at the same
>      > moment in time as the PEBS record, so reflect the correct branch
>      > sequence.
> 
>      Actually it happens with LBRs too, but it always gives the backtrace
>      consistently at the PMI trigger point.
> 
>    That's weird - so the LBR records are from the PMI point, but the rest of
>    the PEBS record comes from the PEBS trigger point? Or the LBR isn't part
>    of PEBS at all?

LBR is not part of PEBS, but is collected separately in the PMI handler.

>      > overhead calculations will be based on the captured stacks, I guess -
>      > but when I annotate, will the values I see correspond to the PEBS IPs
>      > or the PMI IPs?
> 
>      Based on PEBS IPs.
> 
>      It would be a good idea to add a check to perf report
>      that the two IPs are different, and if they differ
>      add some indicator to the sample. This could be a new sort key,
>      although that would waste some space on the screen, or something
>      else.
> 
>    In the case that PEBS events are used, the IP will differ essentially 100%
>    of the time, right? That is, there will always be *some* skid.

I wouldn't say that.  It depends on what the CPU is doing and the IPC
of the code.

Also, the backtrace inconsistency can only happen if the sample races with
a function return. If it doesn't, the backtrace will point
to the correct function, even though the unwind IP is different.

For example in the common case where you profile a long loop it
is unlikely to happen.
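
To illustrate the point above, here is a rough sketch (not perf code) of
how one might classify a sample. It assumes that "both IPs resolve to the
same function" is a usable proxy for "the sample did not race with a
function return". The symbol table, the addresses, and the names
function_at and backtrace_suspect are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start, end, name), sorted by start address.
SYMBOLS = [
    (0x1000, 0x1080, "main"),
    (0x1080, 0x1100, "long_loop"),
    (0x1100, 0x1120, "tiny_leaf"),
]
STARTS = [start for start, _end, _name in SYMBOLS]


def function_at(ip):
    """Return the name of the function containing ip, or None if unmapped."""
    i = bisect_right(STARTS, ip) - 1
    if i >= 0 and ip < SYMBOLS[i][1]:
        return SYMBOLS[i][2]
    return None


def backtrace_suspect(pebs_ip, pmi_ip):
    """True if the PEBS IP and the PMI (unwind) IP land in different functions.

    Assumption: skid inside one function leaves the call chain valid, so
    only a function mismatch marks the backtrace as possibly inconsistent.
    """
    return function_at(pebs_ip) != function_at(pmi_ip)
```

With this toy table, two differing IPs inside long_loop are not flagged,
while a PEBS IP in tiny_leaf paired with a PMI IP back in main is.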


>    indicating otherwise above), I could imagine a hybrid mode where LBR is
>    used to go back some number of calls and then dwarf or FP or whatever
>    unwinding takes over, because the further down the stack you go, the more
>    likely the PEBS trigger point and PMI point are to have a
>    consistent stack.

We could collect numbers on how often it happens, but it would surprise
me if anything complicated is worth it. I would just do the minimum fixes
to address the unwinder errors, and perhaps add the "unwind ip differs"
indication.
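
As a minimal sketch of both ideas (collecting numbers and tagging samples),
assuming each sample has already been reduced to a pair of addresses: the
Sample type, its field names, and the "D" marker are hypothetical and are
not part of perf's record format or of perf report's output.

```python
from collections import namedtuple

# Hypothetical reduced sample: the IP from the PEBS record and the IP
# the unwinder started from (the PMI register state).
Sample = namedtuple("Sample", ["pebs_ip", "unwind_ip"])


def tag_samples(samples):
    """Pair each sample with "D" if its two IPs differ, else a blank."""
    return [(s, "D" if s.pebs_ip != s.unwind_ip else " ") for s in samples]


def differing_fraction(samples):
    """Fraction of samples whose PEBS IP and unwind IP differ."""
    if not samples:
        return 0.0
    return sum(s.pebs_ip != s.unwind_ip for s in samples) / len(samples)
```

A one-character marker like this would cost less screen space than a full
sort key, and the fraction gives a first number on how often skid occurs.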

-Andi
