Message-ID: <1535446.SrsFzc0SgI@milian-kdab2>
Date: Wed, 24 May 2017 13:46:04 +0200
From: Milian Wolff <milian.wolff@...b.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
kernel-team@....com
Subject: Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames
On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> Hi Milian,
>
> On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> > On Thursday, 18 May 2017 21:34:04 CEST Milian Wolff wrote:
> > > This series of patches completely reworks the way inline frames are
> > > handled. Instead of querying for the inline nodes on-demand in the
> > > individual tools, we now create proper callchain nodes for inlined
> > > frames. The advantages this approach brings are numerous:
> > >
> > > - less duplicated code in the individual browser
> > > - aggregated cost for inlined frames for the --children top-down list
> > > - various bug fixes that arose from querying for a srcline/symbol based
> > >   on the IP of a sample, which will always point to the last inlined
> > >   frame instead of the corresponding non-inlined frame
> > > - overall much better support for visualizing cost for heavily-inlined
> > >   C++ code, which simply was confusing and unreliable before
> > > - srcline honors the global setting as to whether full paths or
> > >   basenames should be shown
> > >
> > > For comparison, below lists the output before and after for `perf
> > > script` and `perf report`. The example file I used to generate the
> > > perf data is:
> >
> > And of course shortly after sending this patch series I notice the first
> > issues ;-) The new behavior shows confusing results for `-g function`
> > because match_chain uses sym->start. I fixed this locally to compare the
> > actual function name if either of the two symbols is an inlined fake
> > symbol:
>
> Why not make the fake symbol have a start address equal to the sample IP
> and a length of 1? The histogram sort code also compares sym->start,
> which might confuse the output of the children mode too, IMHO.

I can try that out, thank you for the suggestion. But I think it can easily
break in other ways. For example, when the same inline function gets used at
different IPs, it should still be treated as the same function when we
group/merge/aggregate. I updated the `match_chain` function accordingly, so it
does a symname / srcline comparison on inlined frames instead of relying on
the symbol start/end. I don't think using the IP for the fake symbols would be
any more reliable here, do you?

In the end, I think we'll always have to special-case inlined fake symbols
when we aggregate data, since their sym start/end is always going to be some
arbitrary value that may or may not be what we want it to be. Doing the
explicit comparison on e.g. srcline/symname is always going to be the most
reliable option, as it also aggregates directly on the strings that the user
will see in the end.
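
To make the idea concrete, here is a rough, standalone sketch of the kind of
comparison I mean. It is not the actual patch: `struct sketch_symbol`, its
`inlined` flag and `sketch_symbols_match()` are hypothetical, simplified
stand-ins for the real perf structures, and only the function name is
compared (a srcline key would compare the srcline strings the same way).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's symbol structure. */
struct sketch_symbol {
	uint64_t start;    /* start address; arbitrary for inlined fake symbols */
	const char *name;  /* function name as shown to the user */
	bool inlined;      /* true for a fake symbol created for an inlined frame */
};

/*
 * Decide whether two callchain symbols should be merged into one entry.
 * If either side is an inlined fake symbol, its start address carries no
 * reliable meaning, so compare the names instead. Otherwise keep the
 * usual address-based comparison.
 */
bool sketch_symbols_match(const struct sketch_symbol *a,
			  const struct sketch_symbol *b)
{
	assert(a && b && a->name && b->name);

	if (a->inlined || b->inlined)
		return strcmp(a->name, b->name) == 0;

	return a->start == b->start;
}
```

With this, two fake symbols for the same inline function merge even if they
were created for different sample IPs, whereas a start address derived from
the IP would keep them apart.
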
Cheers
--
Milian Wolff | milian.wolff@...b.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts