Message-ID: <20200506113714.GA5281@hirez.programming.kicks-ass.net>
Date:   Wed, 6 May 2020 13:37:14 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Stephane Eranian <eranian@...gle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Vince Weaver <vincent.weaver@...ne.edu>, jpoimboe@...hat.com,
        "Liang, Kan" <kan.liang@...el.com>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: callchain ABI change with commit 6cbc304f2f360

On Tue, May 05, 2020 at 08:37:40PM -0700, Stephane Eranian wrote:
> Hi,
> 
> I have received reports from users who have noticed a change of
> behaviour caused by commit:
> 
> 6cbc304f2f360 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
> 
> When using PEBS sampling on Intel processors.
> 
> Doing simple profiling with:
> $ perf record -g -e cycles:pp ...
> 
> Before:
> 
> 1 1595951041120856 0x7f77f8 [0xe8]: PERF_RECORD_SAMPLE(IP, 0x4002):
> 795385/690513: 0x558aa66a9607 period: 10000019 addr: 0
> ... FP chain: nr:22
> .....  0: fffffffffffffe00
> .....  1: 0000558aa66a9607
> .....  2: 0000558aa66a8751
> .....  3: 0000558a984a3d4f
> 
> Entry 1: matches sampled IP 0x558aa66a9607.
> 
> After:
> 
> 3 487420973381085 0x2f797c0 [0x90]: PERF_RECORD_SAMPLE(IP, 0x4002):
> 349591/146458: 0x559dcd2ef889 period: 10000019 addr: 0
> ... FP chain: nr:11
> .....  0: fffffffffffffe00
> .....  1: 0000559dcd2ef88b
> .....  2: 0000559dcd19787d
> .....  3: 0000559dcd1cf1be
> 
> entry 1 does not match sampled IP anymore.
> 
> Before the patch the kernel was stashing the sampled IP from PEBS into
> the callchain. After the patch it is stashing the interrupted IP, thus
> with the skid.
> 
> I am trying to understand whether this is an intentional change or not
> for the IP.
> 
> It seems that stashing the interrupted IP would be more consistent
> across all sampling modes, i.e., with and without PEBS. Entry 1 would
> always be the interrupted IP.
> The changelog talks about the ORC unwinder being happier with the
> interrupted machine state, but not about the ABI expectation here.
> Could you clarify?

Intentional; fundamentally, we cannot unwind a stack that no longer
exists.

The PEBS record comes in after the fact; the stack at the time of the
record is irretrievably gone. The only (and best) thing we can do is
provide the unwind at the interrupt.

Adding a previous IP on top of a later unwind gives completely
insane/broken call-stacks.
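
For anyone whose tooling assumed the old behaviour, here is a minimal
sketch (Python, not part of perf) of how one might measure the skid
between the sampled IP and the first callchain frame, using the two
samples quoted above. The helper names first_frame() and skid_bytes() are
hypothetical; the only kernel-derived value is PERF_CONTEXT_MAX,
(u64)-4095 from the perf_event uapi header, at or above which a callchain
entry is a context marker (such as fffffffffffffe00, PERF_CONTEXT_USER)
rather than an address.

```python
# Callchain entries at or above PERF_CONTEXT_MAX, i.e. (u64)-4095, are
# context markers (PERF_CONTEXT_USER, PERF_CONTEXT_KERNEL, ...), not IPs.
PERF_CONTEXT_MAX = (1 << 64) - 4095


def first_frame(callchain):
    """Return the first real address in a callchain, skipping markers."""
    for entry in callchain:
        if entry < PERF_CONTEXT_MAX:
            return entry
    return None


def skid_bytes(sample_ip, callchain):
    """Distance from the sampled IP to the first unwound frame.

    0 means the callchain starts at the sampled (PEBS) IP; anything else
    means it starts at the interrupted IP, i.e. it includes the skid.
    """
    frame = first_frame(callchain)
    if frame is None:
        return None
    return frame - sample_ip


# Values copied from the two PERF_RECORD_SAMPLE dumps quoted above
# (only the first four callchain entries were shown there).
before = skid_bytes(
    0x558AA66A9607,
    [0xFFFFFFFFFFFFFE00, 0x0000558AA66A9607, 0x0000558AA66A8751, 0x0000558A984A3D4F],
)
after = skid_bytes(
    0x559DCD2EF889,
    [0xFFFFFFFFFFFFFE00, 0x0000559DCD2EF88B, 0x0000559DCD19787D, 0x0000559DCD1CF1BE],
)

print(before, after)
```

A consumer that wants the precise PEBS IP should take it from the sample's
IP field and treat the callchain as describing the interrupted state only.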
