linux-kernel - Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c99925a7-6369-d4d6-5227-3b39f5aca8ed@gmail.com>
Date:   Thu, 29 Jun 2017 01:27:34 +0900
From:   Taeung Song <treeze.taeung@...il.com>
To:     Milian Wolff <milian.wolff@...b.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        linux-kernel@...r.kernel.org,
        Adrian Hunter <adrian.hunter@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        David Ahern <dsahern@...il.com>,
        Jin Yao <yao.jin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Kim Phillips <kim.phillips@....com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Wang Nan <wangnan0@...wei.com>
Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the
 new source code TUI view



On 06/28/2017 06:53 PM, Milian Wolff wrote:
> On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
>> Hi,
>>
>> The --source-only option and new source code TUI view can show
>> the result of performance analysis based on full source code per
>> symbol(function). (Namhyung Kim told me this idea and it was also requested
>> by others some time ago..)
>>
>> If someone wants to see the cause, he/she will need to dig into the asm.
>> But before that, looking at the source level can give a hint or clue
>> for the problem.
>>
>> For example, if target symbol is 'hex2u64' of util/util.c,
>> the output is like below.
>>
>>      $ perf annotate --source-only --stdio -s hex2u64
>>   Percent |      Source code of util.c for cycles:ppp (42 samples)
>> -----------------------------------------------------------------
>>      0.00 : 354   * While we find nice hex chars, build a long_val.
>>      0.00 : 355   * Return number of chars processed.
>>      0.00 : 356   */
>>      0.00 : 357  int hex2u64(const char *ptr, u64 *long_val)
>>      2.38 : 358  {
>>      2.38 : 359          const char *p = ptr;
>>      0.00 : 360          *long_val = 0;
>>      0.00 : 361
>>     30.95 : 362          while (*p) {
>>     23.81 : 363                  const int hex_val = hex(*p);
>>      0.00 : 364
>>     14.29 : 365                  if (hex_val < 0)
>>      0.00 : 366                          break;
>>      0.00 : 367
>>     26.19 : 368                  *long_val = (*long_val << 4) | hex_val;
>>      0.00 : 369                  p++;
>>      0.00 : 370          }
>>      0.00 : 371
>>      0.00 : 372          return p - ptr;
>>      0.00 : 373  }
>>
>> And I added many perf developers into Cc: because I want to listen to your
>> opinions about this new feature, if you don't mind.
>>
>> If you give some feedback, I'd appreciate it! :)
> 
> Thanks Taeung,
> 
> I requested this feature some time ago and it's really cool to see someone
> step up and implement it - much appreciated!

Thank you so much, Milian !! :)

> 
> I just tested it out on my pet-example that leverages C++ instead of C:
> 
> ~~~~~
> #include <complex>
> #include <cmath>
> #include <random>
> #include <iostream>
> 
> using namespace std;
> 
> int main()
> {
>      uniform_real_distribution<double> uniform(-1E5, 1E5);
>      default_random_engine engine;
>      double s = 0;
>      for (int i = 0; i < 10000000; ++i) {
>          s += norm(complex<double>(uniform(engine), uniform(engine)));
>      }
>      cout << s << '\n';
>      return 0;
> }
> ~~~~~
> 
> Compile it with:
> 
> g++ -O2 -g -std=c++11 test.cpp -o test
> 
> Then record it with perf:
> 
> perf record --call-graph dwarf ./test
> 
> Then analyse it with `perf report`. You'll see one entry for main with
> something like:
> 
> +  100.00%    39.69%  cpp-inlining  cpp-inlining      [.] main
> 
> Select it and annotate it, then switch to your new source-only view:
> 
> main  test.cpp
>         │  30
>         │  31    using namespace std;
>         │  32
>         │  33    int main()
>         │+ 34    {
>         │  35        uniform_real_distribution<double> uniform(-1E5, 1E5);
>         │  36        default_random_engine engine;
>         │+ 37        double s = 0;
>         │+ 38        for (int i = 0; i < 10000000; ++i) {
>    4.88 │+ 39            s += norm(complex<double>(uniform(engine),
> uniform(engine)));
>         │  40        }
>         │  41        cout << s << '\n';
>         │  42        return 0;
>         │+ 43    }
> 
> Note: the line numbers are off b/c my file contains a file-header on-top.
> Ignore that.
> 
> Note2: There is no column header shown, so it's unclear what the first column
> represents.
> 
> Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate
> shows 4.88... What is that?
> 
> What this shows, is that it's extremely important to visualize inclusive cost
> _and_ self cost in this view. Additionally, we need to account for inlining.
> Right now, we only see the self cost that is directly within main, I suspect.
> For C++ this is usually very misleading, and basically makes the annotate view
> completely useless for application-level profiling. If a second column would
> be added with the inclusive cost with the ability to drill down, then I could
> easily see myself using this view.
> 
> I would appreciate if you could take this into account.
> 
> Thanks a lot
> 
> 

Sure, I got it.
I'll investigate this weird case and recheck this patchset based on your 
comments,
and then I'll reply again. :)

Thanks,
Taeung