linux-kernel - Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the new source code TUI view

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b8e6cc17-599c-98aa-6dc6-284a923e4fa5@gmail.com>
Date:   Fri, 30 Jun 2017 16:14:46 +0900
From:   Taeung Song <treeze.taeung@...il.com>
To:     Namhyung Kim <namhyung@...nel.org>,
        Milian Wolff <milian.wolff@...b.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Adrian Hunter <adrian.hunter@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        David Ahern <dsahern@...il.com>,
        Jin Yao <yao.jin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Kim Phillips <kim.phillips@....com>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Wang Nan <wangnan0@...wei.com>, kernel-team@....com
Subject: Re: [PATCH/RFC 0/4] perf annotate: Add --source-only option and the
 new source code TUI view

Hi,

On 06/29/2017 04:11 PM, Namhyung Kim wrote:
> Hello,
> 
> On Wed, Jun 28, 2017 at 11:53:22AM +0200, Milian Wolff wrote:
>> On Wednesday, June 28, 2017 5:18:08 AM CEST Taeung Song wrote:
>>> Hi,
>>>
>>> The --source-only option and new source code TUI view can show
>>> the result of performance analysis based on full source code per
>>> symbol(function). (Namhyung Kim told me this idea and it was also requested
>>> by others some time ago..)
>>>
>>> If someone wants to see the cause, he/she will need to dig into the asm.
>>> But before that, looking at the source level can give a hint or clue
>>> for the problem.
>>>
>>> For example, if target symbol is 'hex2u64' of util/util.c,
>>> the output is like below.
>>>
>>>      $ perf annotate --source-only --stdio -s hex2u64
>>>   Percent |      Source code of util.c for cycles:ppp (42 samples)
>>> -----------------------------------------------------------------
>>>      0.00 : 354   * While we find nice hex chars, build a long_val.
>>>      0.00 : 355   * Return number of chars processed.
>>>      0.00 : 356   */
>>>      0.00 : 357  int hex2u64(const char *ptr, u64 *long_val)
>>>      2.38 : 358  {
>>>      2.38 : 359          const char *p = ptr;
>>>      0.00 : 360          *long_val = 0;
>>>      0.00 : 361
>>>     30.95 : 362          while (*p) {
>>>     23.81 : 363                  const int hex_val = hex(*p);
>>>      0.00 : 364
>>>     14.29 : 365                  if (hex_val < 0)
>>>      0.00 : 366                          break;
>>>      0.00 : 367
>>>     26.19 : 368                  *long_val = (*long_val << 4) | hex_val;
>>>      0.00 : 369                  p++;
>>>      0.00 : 370          }
>>>      0.00 : 371
>>>      0.00 : 372          return p - ptr;
>>>      0.00 : 373  }
>>>
>>> And I added many perf developers into Cc: because I want to listen to your
>>> opinions about this new feature, if you don't mind.
>>>
>>> If you give some feedback, I'd appreciate it! :)
>>
>> Thanks Taeung,
>>
>> I requested this feature some time ago and it's really cool to see someone
>> step up and implement it - much appreciated!
>>
>> I just tested it out on my pet-example that leverages C++ instead of C:
>>
>> ~~~~~
>> #include <complex>
>> #include <cmath>
>> #include <random>
>> #include <iostream>
>>
>> using namespace std;
>>
>> int main()
>> {
>>      uniform_real_distribution<double> uniform(-1E5, 1E5);
>>      default_random_engine engine;
>>      double s = 0;
>>      for (int i = 0; i < 10000000; ++i) {
>>          s += norm(complex<double>(uniform(engine), uniform(engine)));
>>      }
>>      cout << s << '\n';
>>      return 0;
>> }
>> ~~~~~
>>
>> Compile it with:
>>
>> g++ -O2 -g -std=c++11 test.cpp -o test
>>
>> Then record it with perf:
>>
>> perf record --call-graph dwarf ./test
>>
>> Then analyse it with `perf report`. You'll see one entry for main with
>> something like:
>>
>> +  100.00%    39.69%  cpp-inlining  cpp-inlining      [.] main
>>
>> Select it and annotate it, then switch to your new source-only view:
>>
>> main  test.cpp
>>         │  30                                                                                                                                                                                             >        │  31    using namespace std;                                                                                                                                                                     >        │  32                                                                                                                                                                                             >        │  33    int main()                                                                                                                                                                               >        │+ 34    {                                                                                                                                                                                        >        │  35        uniform_real_distribution<double> uniform(-1E5, 1E5);                                                                                                                                >        │  36        default_random_engine engine;                                                                                                                                                        >        │+ 37        double s = 0;                                                                                                                                                                        >        │+ 38        for (int i = 0; i < 10000000; ++i) {                                                                                                                                                 >   4.88 │+ 39            s += norm(complex<double>(uniform(engine), uniform(engine)));                                                                                                                    >        │  40        }                                                                                                                                                                                    >        │  41        cout << s << '\n';                                                                                                                                                                   >        │  42        return 0;                                                                                                                                                                            >        │+ 43    }
>>
>> Note: the line numbers are off b/c my file contains a file-header on-top.
>> Ignore that.
>>
>> Note2: There is no column header shown, so it's unclear what the first column
>> represents.
>>
>> Note 3: report showed 39.69% self cost in main, 100.00% inclusive. annotate
>> shows 4.88... What is that?
>>
>> What this shows, is that it's extremely important to visualize inclusive cost
>> _and_ self cost in this view. Additionally, we need to account for inlining.
>> Right now, we only see the self cost that is directly within main, I suspect.
> 
> Currently perf annotate doesn't use the sample period, it uses sample
> count instead and print the percentage within the function.  So it's a
> different number to the perf report.  I think we need to fix this
> first.
> 
> Thanks,
> Namhyung
> 

I understood. Hum.. so we need to replace the menu and column about 
total period to things about sample count like below ?

"t             Toggle total period view"

   -> "t             Toggle showing the number of samples for"

(I'm not sure what a short key(e.g. 't') is proper..)

Or modifying the code related to the number of samples,
show actual total period on perf-annotate ?

What do you think about this change ?

Thanks,
Taeung

> 
>> For C++ this is usually very misleading, and basically makes the annotate view
>> completely useless for application-level profiling. If a second column would
>> be added with the inclusive cost with the ability to drill down, then I could
>> easily see myself using this view.
>>
>> I would appreciate if you could take this into account.
>>
>> Thanks a lot
>>
>>
>> -- 
>> Milian Wolff | milian.wolff@...b.com | Senior Software Engineer
>> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
>> Tel: +49-30-521325470
>> KDAB - The Qt Experts
> 
>