lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Mar 2020 15:11:04 +0800
From:   Jin Yao <yao.jin@...ux.intel.com>
To:     acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
        mingo@...hat.com, alexander.shishkin@...ux.intel.com
Cc:     Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        kan.liang@...el.com, yao.jin@...el.com,
        Jin Yao <yao.jin@...ux.intel.com>
Subject: [PATCH v2 00/14] perf: Stream comparison

Sometimes, a small change in a hot function reducing the cycles of
this function, but the overall workload doesn't get faster. It is
interesting where the cycles are moved to.

What it would like is to diff before/after streams. A stream we think
is a callchain which is aggregated by the branch records from samples.

By browsing the hot streams, we can understand the hot code flow.
By comparing the cycles variation of same streams between old perf
data and new perf data, we can understand if the cycles are moved to
the unchanged code.

The before stream is the stream before source code changed
(e.g. streams in perf.data.old). The after stream is the stream
after source code changed (e.g. streams in perf.data).

Diffing before/after streams compares all streams (or compares top
N hot streams) between two perf data files.

If all entries of one stream in perf.data.old are fully matched with
all entries of another stream in perf.data, we think these two streams
are matched otherwise the streams are not matched.

For example,

   cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
--------------------------              --------------------------
             main div.c:39                           main div.c:39
             main div.c:44                           main div.c:44

It looks that two streams are matched and we can see for the same
streams the cycles are equal and the callchain hit percents are
slightly changed. That's expected in the normal range.

But that's not always true if source code is changed in perf.data
(e.g. source line div.c:39 is changed). If the source line is changed,
they are different streams, we can't compare them. We will think the
stream in perf.data is a new stream.

The challenge is how to identify the changed source lines. The basic
idea is to use linux command "diff" to compare the source file A and
source file A* line by line (assume A is used in perf.data.old
and A* is updated in perf.data). According to "diff" output, we can
create a source line mapping table.

For example,

  Execute 'diff ./before/div.c ./after/div.c'

  25c25
  <       i = rand() % 2;
  ---
  >       i = rand() % 4;
  39c39
  <       for (i = 0; i < 2000000000; i++) {
  ---
  >       for (i = 0; i < 20000000001; i++) {

  div.c (after -> before) lines mapping:
  0 -> 0
  1 -> 1
  2 -> 2
  3 -> 3
  4 -> 4
  5 -> 5
  6 -> 6
  7 -> 7
  8 -> 8
  9 -> 9
  ...
  24 -> 24
  25 -> -1
  26 -> 26
  27 -> 27
  28 -> 28
  29 -> 29
  30 -> 30
  31 -> 31
  32 -> 32
  33 -> 33
  34 -> 34
  35 -> 35
  36 -> 36
  37 -> 37
  38 -> 38
  39 -> -1
  40 -> 40
  ...

>From the table, we can easily know div.c:39 is source line changed.
(mapped to -1). So these two streams are not matched.

Besides the hot streams comparison, this patch series also support
the top N hottest blocks comparison.

It's also useful to figure out the top N hottest blocks from old perf
data file and figure out the top N hottest blocks from new perf data file,
and then compare them for the cycles diff. It can let us easily know
how many cycles are moved from one block to another block.

Now let's see examples.

perf record -b ...      Generate perf.data.old with branch data
perf record -b ...      Generate perf.data with branch data
perf diff --stream --percent-limit 2

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39

hot chain pair 3:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 4:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

Ignore the rightmost columns such as '[Program Block Range]' and 'Shared Object' for saving space

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18                     -0.36%,   -1                           -                      -
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -

If we enable the source line comparison, the output might be different.

perf diff --stream --before ./before --after ./after

[ Matched hot chains between old perf data and new perf data) ]

hot chain pair 1:
            cycles: 18, hits: 6.10%                 cycles: 19, hits: 6.51%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40

hot chain pair 2:
             cycles: 9, hits: 5.95%                  cycles: 8, hits: 5.03%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380

[ Hot chains in old perf data but source line changed (*) in new perf data ]

hot chain pair 1:
            cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
        ---------------------------              --------------------------
                      main div.c:39                           main div.c:39*
                      main div.c:44                           main div.c:44

hot chain pair 2:
           cycles: 35, hits: 21.43%                cycles: 33, hits: 19.37%
        ---------------------------              --------------------------
          __random_r random_r.c:360               __random_r random_r.c:360
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:388               __random_r random_r.c:388
          __random_r random_r.c:380               __random_r random_r.c:380
          __random_r random_r.c:357               __random_r random_r.c:357
              __random random.c:293                   __random random.c:293
              __random random.c:293                   __random random.c:293
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:291                   __random random.c:291
              __random random.c:288                   __random random.c:288
                     rand rand.c:27                          rand rand.c:27
                     rand rand.c:26                          rand rand.c:26
                           rand@plt                                rand@plt
                           rand@plt                                rand@plt
              compute_flag div.c:25                   compute_flag div.c:25
              compute_flag div.c:22                   compute_flag div.c:22
                      main div.c:40                           main div.c:40
                      main div.c:40                           main div.c:40
                      main div.c:39                           main div.c:39*

[ Hot chains in old perf data only ]

hot chain 1:
             cycles: 2, hits: 4.08%
         --------------------------
                      main div.c:42
              compute_flag div.c:28

[ Hot chains in new perf data only ]

hot chain 1:
                                                    cycles: 36, hits: 3.36%
                                                 --------------------------
                                                  __random_r random_r.c:357
                                                      __random random.c:293
                                                      __random random.c:293
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:291
                                                      __random random.c:288
                                                             rand rand.c:27
                                                             rand rand.c:26
                                                                   rand@plt
                                                                   rand@plt
                                                      compute_flag div.c:25
                                                      compute_flag div.c:22
                                                              main div.c:40
                                                              main div.c:40

# Output based on old perf data:
#
# Sampled Cycles%  Avg Cycles  New Stream Diff(cycles%,cycles)  New Stream Sampled Cycles%  New Stream Avg Cycles
# ...............  ..........  ...............................  ..........................  .....................
#
           25.20%          18    [block changed in new stream]                      24.84%                     17
           15.24%           7                     -0.45%,    0                           -                      -
            5.07%           2                      0.09%,    0                           -                      -
            4.84%           2                      0.26%,    0                           -                      -
            4.72%           2                      0.30%,    0                           -                      -
            3.91%           1                      0.29%,    0                           -                      -
            3.05%           1                      0.11%,    0                           -                      -
            2.90%           1                      0.08%,    0                           -                      -
            2.71%           1                     -0.11%,    0                           -                      -
            2.44%           1                      0.09%,    0                           -                      -
            2.35%           1                     -0.09%,    0                           -                      -
            2.27%           1                      0.15%,    0                           -                      -
            2.27%           1                      0.06%,    0                           -                      -
            2.17%           1                      0.09%,    0                           -                      -

Sometime some changes are not reflected in the source code,
e.g. changing the compiler option. So for this, we can't get
the changes by diffing the source code lines.

This patch series also introduces a new perf-diff option "--changed-func".
It passes the names of changed functions then perf-diff can know what
functions are changed.

For example,
perf diff --stream --changed-func main --changed-func rand

NOTE:
-----
1. For the patches:

  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison

  These patches support the hot stream comparison.

2. For the patches:
  perf util: Add new block info functions for top N hot blocks comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison

  These patches support the hot blocks comparison.

3. For the patches
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

  These patches support a user specified function name list which let
  perf-diff know these functions are changed.

 v2:
 ---
 Refine the codes for following patches:
  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Calculate the sum of all streams hits
  perf util: Add new block info functions for top N hot blocks comparison 

Jin Yao (14):
  perf util: Create source line mapping table
  perf util: Create streams for managing top N hottest callchains
  perf util: Return per-event callchain streams
  perf util: Compare two streams
  perf util: Calculate the sum of all streams hits
  perf util: Report hot streams
  perf diff: Support hot streams comparison
  perf util: Add new block info functions for top N hot blocks
    comparison
  perf util: Add new block info fmts for showing hot blocks comparison
  perf util: Enable block source line comparison
  perf diff: support hot blocks comparison
  perf util: Filter out streams by name of changed functions
  perf util: Filter out blocks by name of changed functions
  perf diff: Filter out streams by changed functions

 tools/perf/Documentation/perf-diff.txt |  19 +
 tools/perf/builtin-diff.c              | 324 ++++++++++++---
 tools/perf/util/Build                  |   1 +
 tools/perf/util/block-info.c           | 433 ++++++++++++++++++-
 tools/perf/util/block-info.h           |  38 +-
 tools/perf/util/callchain.c            | 517 +++++++++++++++++++++++
 tools/perf/util/callchain.h            |  34 ++
 tools/perf/util/srclist.c              | 555 +++++++++++++++++++++++++
 tools/perf/util/srclist.h              |  74 ++++
 9 files changed, 1935 insertions(+), 60 deletions(-)
 create mode 100644 tools/perf/util/srclist.c
 create mode 100644 tools/perf/util/srclist.h

-- 
2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ