Date:   Wed, 24 May 2017 15:42:59 +0200
From:   Milian Wolff <milian.wolff@...b.com>
To:     Milian Wolff <milian.wolff@...b.com>
Cc:     Namhyung Kim <namhyung@...nel.org>, Linux-kernel@...r.kernel.org,
        linux-perf-users@...r.kernel.org, kernel-team@....com
Subject: Re: [PATCH 0/7] generate full callchain cursor entries for inlined frames

On Wednesday, May 24, 2017 1:46:04 PM CEST Milian Wolff wrote:
> On Monday, May 22, 2017 11:06:43 AM CEST Namhyung Kim wrote:
> > Hi Milian,
> > 
> > On Thu, May 18, 2017 at 10:05:36PM +0200, Milian Wolff wrote:
> > > On Thursday, May 18, 2017 21:34:04 CEST Milian Wolff wrote:
> > > > This series of patches completely reworks the way inline frames are
> > > > handled. Instead of querying for the inline nodes on-demand in the
> > > > individual tools, we now create proper callchain nodes for inlined
> > > > frames. The advantages this approach brings are numerous:
> > > > 
> > > > - less duplicated code in the individual browser
> > > > - aggregated cost for inlined frames for the --children top-down list
> > > > - various bug fixes that arose from querying for a srcline/symbol
> > > >   based on the IP of a sample, which will always point to the last
> > > >   inlined frame instead of the corresponding non-inlined frame
> > > > - overall much better support for visualizing cost for heavily-inlined
> > > >   C++ code, which simply was confusing and unreliable before
> > > > - srcline honors the global setting as to whether full paths or
> > > >   basenames should be shown
> > > > 
> > > > For comparison, below lists the output before and after for `perf
> > > > script` and `perf report`. The example file I used to generate the
> > > > perf data is:
> > > 
> > > And of course shortly after sending this patch series I notice the first
> > > issues ;-) The new behavior shows confusing results for `-g function`
> > > because match_chain uses sym->start. I fixed this locally to compare the
> > > actual function name if either of the two symbols is an inlined fake
> > > symbol:
> > 
> > Why not make the fake symbol have a start addr of the sample IP and a
> > length of 1?  The histogram sort code also compares sym->start,
> > which might confuse the output of the children mode too IMHO.
> 
> I can try that out, thank you for the suggestion. But I think it can easily
> break in different ways. I.e. when the same inline function gets used at
> different IPs, it should actually be considered to be the same function when
> we group/merge/aggregate. I updated the `match_chain` function accordingly,
> to do a symname / srcline comparison on inlined frames, instead of relying
> on the symbol start/end. I think using the IP for the fake symbols won't be
> more reliable here, don't you think?
> 
> In the end, I think we'll always have to special-case inlined fake symbols
> when we aggregate data, since the sym start/end is always going to be some
> arbitrary value that may or may not be what we want it to be. Doing the
> explicit comparison on e.g. srcline/symname is always going to be the most
> reliable option, as it also directly results in a proper aggregation based
> on the strings that the user will see in the end.

I haven't yet tried it out, but I think I can come up with a way to break your 
approach easily. Assume the following pseudo-code:

void tail()
{
    instr1; // IP1
    instr2; // IP2
}

void mid()
{
    tail();
}

void main()
{
    mid();
}

Now, assume both `tail` and `mid` get inlined into `main`. If we get one 
sample at IP1 and one at IP2, we want the following merged structure when we 
merge based on symbol:

sym  | incl | self
main | 2    | 0
mid  | 2    | 0
tail | 2    | 2

If we gave the inlined fake symbols a start address equal to the sample IP, 
i.e. either IP1 or IP2, we would end up with this (unexpected) result instead:

sym  | incl | self
main | 2    | 0
mid  | 1    | 0
mid  | 1    | 0
tail | 1    | 1
tail | 1    | 1

The reason is that the fake symbols for the inlined frames would be considered 
to be different functions since their start/end are not equal. This is "wrong" 
in my eyes - we really have to do symbol name comparisons for inlined frames, 
and also include srcline if that is desired.
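
To make the difference concrete, here is a minimal standalone sketch in C. It 
is not perf code: `struct fake_sym`, `match_by_start`, `match_by_name`, 
`shared_prefix`, `merged_nodes` and the two sample chains are hypothetical 
names, and the addresses are made up. It models each sample as a three-frame 
callchain (main, mid, tail) and counts how many nodes remain after merging the 
two chains on their common prefix, once comparing start addresses only and 
once comparing names whenever an inlined frame is involved:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a symbol; this is NOT perf's struct symbol. */
struct fake_sym {
    const char *name;
    uint64_t start;   /* for inlined frames: the sample IP (the idea under test) */
    bool inlined;
};

typedef bool (*match_fn)(const struct fake_sym *, const struct fake_sym *);

/* Treat two frames as the same function only if their start addresses agree. */
static bool match_by_start(const struct fake_sym *a, const struct fake_sym *b)
{
    return a->start == b->start;
}

/* Compare names if either frame is an inlined fake symbol, else start addresses. */
static bool match_by_name(const struct fake_sym *a, const struct fake_sym *b)
{
    if (a->inlined || b->inlined)
        return strcmp(a->name, b->name) == 0;
    return a->start == b->start;
}

/* Number of leading frames (root first) that the two chains have in common. */
static size_t shared_prefix(const struct fake_sym *a, const struct fake_sym *b,
                            size_t depth, match_fn match)
{
    size_t i = 0;

    assert(a && b && match);
    while (i < depth && match(&a[i], &b[i]))
        i++;
    return i;
}

/* Nodes left after merging two chains of equal depth on their common prefix. */
static size_t merged_nodes(const struct fake_sym *a, const struct fake_sym *b,
                           size_t depth, match_fn match)
{
    return 2 * depth - shared_prefix(a, b, depth, match);
}

/* Made-up addresses: main is a real symbol, mid and tail are inlined into it. */
static const struct fake_sym chain_ip1[3] = {
    { "main", 0x1000, false },
    { "mid",  0x1010, true  },   /* start = IP1 */
    { "tail", 0x1010, true  },   /* start = IP1 */
};

static const struct fake_sym chain_ip2[3] = {
    { "main", 0x1000, false },
    { "mid",  0x1014, true  },   /* start = IP2 */
    { "tail", 0x1014, true  },   /* start = IP2 */
};
```

With these inputs, `merged_nodes(chain_ip1, chain_ip2, 3, match_by_start)` 
gives 5 nodes and the `match_by_name` variant gives 3, which corresponds to 
the five-row and three-row tables above.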

If you think the above is not a valid assessment, I'll try to change my patch 
series to use the "start = IP, length = 1" trick you suggest. But I really 
don't think it's going to work.

Cheers
-- 
Milian Wolff | milian.wolff@...b.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts