linux-kernel - Re: callchain ABI change with commit 6cbc304f2f360

Message-ID: <CABPqkBR5yoocv=6P_EDbMR64Pdyom6VKHOn7b6XnAi2Lf_z4Mg@mail.gmail.com>
Date:   Wed, 6 May 2020 11:48:56 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Vince Weaver <vincent.weaver@...ne.edu>, jpoimboe@...hat.com,
        "Liang, Kan" <kan.liang@...el.com>, Andi Kleen <ak@...ux.intel.com>
Subject: Re: callchain ABI change with commit 6cbc304f2f360

On Wed, May 6, 2020 at 4:37 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, May 05, 2020 at 08:37:40PM -0700, Stephane Eranian wrote:
> > Hi,
> >
> > I have received reports from users who noticed a change of behaviour
> > when using PEBS sampling on Intel processors. It is caused by commit:
> >
> > 6cbc304f2f360 ("perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)")
> >
> > Doing simple profiling with:
> > $ perf record -g -e cycles:pp ...
> >
> > Before:
> >
> > 1 1595951041120856 0x7f77f8 [0xe8]: PERF_RECORD_SAMPLE(IP, 0x4002):
> > 795385/690513: 0x558aa66a9607 period: 10000019 addr: 0
> > ... FP chain: nr:22
> > .....  0: fffffffffffffe00
> > .....  1: 0000558aa66a9607
> > .....  2: 0000558aa66a8751
> > .....  3: 0000558a984a3d4f
> >
> > Entry 1 matches the sampled IP 0x558aa66a9607.
> >
> > After:
> >
> > 3 487420973381085 0x2f797c0 [0x90]: PERF_RECORD_SAMPLE(IP, 0x4002):
> > 349591/146458: 0x559dcd2ef889 period: 10000019 addr: 0
> > ... FP chain: nr:11
> > .....  0: fffffffffffffe00
> > .....  1: 0000559dcd2ef88b
> > .....  2: 0000559dcd19787d
> > .....  3: 0000559dcd1cf1be
> >
> > Entry 1 (0x559dcd2ef88b) no longer matches the sampled IP 0x559dcd2ef889.
> >
> > Before the patch the kernel was stashing the sampled IP from PEBS into
> > the callchain. After the patch it stashes the interrupted IP, which
> > includes the skid.
> >
> > I am trying to understand whether this is an intentional change or not
> > for the IP.
> >
> > It seems that stashing the interrupted IP would be more consistent across all
> > sampling modes, i.e., with and without PEBS: entry 1 would always be
> > the interrupted IP.
> > The changelog says the ORC unwinder is happier with the
> > interrupted machine state, but says nothing about the ABI
> > expectation here.
> > Could you clarify?
>
> Intentional; fundamentally, we cannot unwind a stack that no longer
> exists.
>
Ok, thanks for clarifying this.

> The PEBS record comes in after the fact, the stack at the time of record
> is irretrievably gone. The only (and best) thing we can do is provide
> the unwind at the interrupt.
>
The PEBS record is always at an IP before or equal to the interrupted IP.
The stack at the PEBS sample may be gone if the sample was captured in a
function epilogue, where sp/rbp are modified.

> Adding a previous IP on top of a later unwind gives completely
> insane/broken call-stacks.

I agree that using the interrupted IP is the most reliable thing to do.

You can get the callstack at the PEBS sample with LBR callstack on Icelake
because PEBS can record LBR. I am hoping this works with the existing code.
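
For tools that compare the first callchain frame against the sampled IP, the
difference between the two dumps quoted above can be checked per sample. The
following is a minimal sketch, assuming the sample IP and callchain entries
have already been parsed into integers; the helper names first_frame() and
skid() are hypothetical and not part of perf. The context-marker threshold
follows the PERF_CONTEXT_* values in include/uapi/linux/perf_event.h.

```python
# PERF_CONTEXT_* markers are small negative values stored as u64
# (PERF_CONTEXT_USER is -512, i.e. 0xfffffffffffffe00). Any entry at or
# above PERF_CONTEXT_MAX (-4095) is a marker, not an address.
U64 = 1 << 64
PERF_CONTEXT_MAX = U64 - 4095


def first_frame(callchain):
    """Return the first real address in a callchain, skipping context markers."""
    for entry in callchain:
        if entry < PERF_CONTEXT_MAX:
            return entry
    return None


def skid(sample_ip, callchain):
    """Distance in bytes from the sampled IP to the first callchain frame.

    Hypothetical helper: 0 means the callchain starts at the sampled IP,
    a positive value means it starts at a later (interrupted) IP.
    """
    frame = first_frame(callchain)
    return None if frame is None else frame - sample_ip


# Values copied from the "Before" and "After" dumps in this thread.
before = [0xfffffffffffffe00, 0x0000558aa66a9607,
          0x0000558aa66a8751, 0x0000558a984a3d4f]
after = [0xfffffffffffffe00, 0x0000559dcd2ef88b,
         0x0000559dcd19787d, 0x0000559dcd1cf1be]

print(skid(0x558aa66a9607, before))  # 0: entry 1 is the sampled IP
print(skid(0x559dcd2ef889, after))   # 2: entry 1 is 2 bytes past the sampled IP
```

A non-zero result only shows that the first frame is not the sampled IP; it
does not by itself tell which instruction was interrupted.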