Message-ID: <20200506113714.GA5281@hirez.programming.kicks-ass.net>
Date: Wed, 6 May 2020 13:37:14 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Stephane Eranian <eranian@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Vince Weaver <vincent.weaver@...ne.edu>, jpoimboe@...hat.com,
"Liang, Kan" <kan.liang@...el.com>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: callchain ABI change with commit 6cbc304f2f360

On Tue, May 05, 2020 at 08:37:40PM -0700, Stephane Eranian wrote:
> Hi,
>
> I have received reports from users who have noticed a change of
> behaviour caused by
> commit:
>
> 6cbc304f2f360 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
>
> When using PEBS sampling on Intel processors.
>
> Doing simple profiling with:
> $ perf record -g -e cycles:pp ...
>
> Before:
>
> 1 1595951041120856 0x7f77f8 [0xe8]: PERF_RECORD_SAMPLE(IP, 0x4002):
> 795385/690513: 0x558aa66a9607 period: 10000019 addr: 0
> ... FP chain: nr:22
> ..... 0: fffffffffffffe00
> ..... 1: 0000558aa66a9607
> ..... 2: 0000558aa66a8751
> ..... 3: 0000558a984a3d4f
>
> Entry 1: matches sampled IP 0x558aa66a9607.
>
> After:
>
> 3 487420973381085 0x2f797c0 [0x90]: PERF_RECORD_SAMPLE(IP, 0x4002):
> 349591/146458: 0x559dcd2ef889 period: 10000019 addr: 0
> ... FP chain: nr:11
> ..... 0: fffffffffffffe00
> ..... 1: 0000559dcd2ef88b
> ..... 2: 0000559dcd19787d
> ..... 3: 0000559dcd1cf1be
>
> entry 1 does not match sampled IP anymore.
>
> Before the patch the kernel was stashing the sampled IP from PEBS into
> the callchain. After the patch it is stashing the interrupted IP, thus
> with the skid.
>
> I am trying to understand whether this is an intentional change or not
> for the IP.
>
> It seems that stashing the interrupted IP would be more consistent across all
> sampling modes, i.e., with and without PEBS. Entry 1: would always be
> the interrupted IP.
> The changelog talks about the ORC unwinder being happier with the
> interrupted machine
> state, but not about the ABI expectation here.
> Could you clarify?

Intentional; fundamentally, we cannot unwind a stack that no longer
exists.

The PEBS record comes in after the fact; the stack at the time of the
record is irretrievably gone. The only (and best) thing we can do is
provide the unwind at the interrupt.

Adding a previous IP on top of a later unwind gives completely
insane/broken call-stacks.
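
For tools that consume these samples, the practical consequence is that
the first real callchain entry may differ from the sampled IP by the
skid. A minimal sketch of how a consumer might measure that, using the
two records quoted above: the helper names first_frame() and
skid_bytes() are hypothetical and not part of perf, and the threshold
is assumed to be PERF_CONTEXT_MAX, (u64)-4095, from the perf_event uapi
header, at or above which an entry is a context marker rather than an
address.

```python
# Hypothetical consumer-side check; not part of perf or the kernel.
# Assumption: entries >= PERF_CONTEXT_MAX, i.e. (u64)-4095, are context
# markers (0xfffffffffffffe00 is PERF_CONTEXT_USER, (u64)-512).
PERF_CONTEXT_MAX = (1 << 64) - 4095


def first_frame(callchain):
    """Return the first real address in the chain, skipping context markers."""
    for entry in callchain:
        if entry < PERF_CONTEXT_MAX:
            return entry
    return None


def skid_bytes(sample_ip, callchain):
    """Distance from the sampled (PEBS) IP to the first unwound frame."""
    frame = first_frame(callchain)
    if frame is None:
        return None
    return frame - sample_ip


# First four callchain entries from the "Before" and "After" records.
before = [0xFFFFFFFFFFFFFE00, 0x0000558AA66A9607,
          0x0000558AA66A8751, 0x0000558A984A3D4F]
after = [0xFFFFFFFFFFFFFE00, 0x0000559DCD2EF88B,
         0x0000559DCD19787D, 0x0000559DCD1CF1BE]

print(skid_bytes(0x558AA66A9607, before))  # entry 1 equals the sampled IP
print(skid_bytes(0x559DCD2EF889, after))   # entry 1 is the interrupted IP
```

A non-zero result only says that the unwind starts at the interrupted
IP; the sampled IP in the record itself is still the PEBS one.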