Date:   Tue, 13 Oct 2020 12:25:09 +0800
From:   "Jin, Yao" <yao.jin@...ux.intel.com>
To:     acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
        mingo@...hat.com, alexander.shishkin@...ux.intel.com
Cc:     Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v8 0/7] perf: Stream comparison

Hi Jiri, Hi Arnaldo,

What do you think of the v8 series? v6 got an ACK from Jiri, and I updated the series to v8
according to Arnaldo's comments. Please let me know if there are still any issues with this
version so that I can continue improving the patchset.

Thanks
Jin Yao

On 10/9/2020 10:28 AM, Jin Yao wrote:
> Sometimes a small change in a hot function reduces the cycles of
> this function, but the overall workload doesn't get faster. It is
> interesting to see where the cycles have moved to.
> 
> What we would like is to diff the before/after streams. A stream is a
> branch history aggregated from the branch records in perf
> samples, for example, the callchains aggregated from the branch records.
> 
> By browsing the hot streams, we can understand the hot code path.
> By comparing the cycles variation of the same streams between the old perf
> data and the new perf data, we can understand whether the cycles have moved
> to other code.
> 
> The before stream is the stream in perf.data.old. The after stream
> is the stream in perf.data.
> 
> Diffing before/after streams compares top N hottest streams between
> two perf data files.
> 
> If all entries of one stream in perf.data.old fully match
> all entries of another stream in perf.data, we consider the two streams
> matched; otherwise the streams are not matched.
> 
> For example,
> 
>     cycles: 1, hits: 26.80%                 cycles: 1, hits: 27.30%
> --------------------------              --------------------------
>               main div.c:39                           main div.c:39
>               main div.c:44                           main div.c:44
> 
> The above streams are matched, and we can see that for the same streams the
> cycles (1) are equal and the callchain hit percentages differ slightly
> (26.80% vs. 27.30%). That's expected.
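
(Inline note: the matching rule described above can be sketched roughly as
below. This is only a simplified illustration, not the code from the
patchset. The types struct entry and struct stream and the function
streams_match() are hypothetical names, and each entry is reduced to the
text label shown in the report, which is an assumption made for the
example; the real series compares callchain entries in its own way.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of a stream, e.g. "main div.c:39". */
struct entry {
	const char *label;
};

/* Hypothetical stand-in for one hot stream from a perf data file. */
struct stream {
	const struct entry *entries;
	size_t nr_entries;
	unsigned int cycles;	/* reported next to the stream, not compared */
	double hits_pct;	/* reported next to the stream, not compared */
};

/*
 * Two streams match only if they have the same number of entries and
 * every entry matches the entry at the same position in the other stream.
 * Cycles and hit percentages are deliberately ignored: they are what the
 * report shows side by side for a matched pair.
 */
static bool streams_match(const struct stream *a, const struct stream *b)
{
	assert(a != NULL && b != NULL);

	if (a->nr_entries != b->nr_entries)
		return false;

	for (size_t i = 0; i < a->nr_entries; i++) {
		if (strcmp(a->entries[i].label, b->entries[i].label) != 0)
			return false;
	}

	return true;
}
```

(With this sketch, the two "main div.c:39 / main div.c:44" streams in the
example above compare as matched even though their hit percentages differ.)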
> 
> Now let's look at an example.
> 
> perf record -b ...      Generate perf.data.old with branch data
> perf record -b ...      Generate perf.data with branch data
> perf diff --stream
> 
> [ Matched hot streams ]
> 
> hot chain pair 1:
>              cycles: 1, hits: 27.77%                  cycles: 1, hits: 9.24%
>          ---------------------------              --------------------------
>                        main div.c:39                           main div.c:39
>                        main div.c:44                           main div.c:44
> 
> hot chain pair 2:
>             cycles: 34, hits: 20.06%                cycles: 27, hits: 16.98%
>          ---------------------------              --------------------------
>            __random_r random_r.c:360               __random_r random_r.c:360
>            __random_r random_r.c:388               __random_r random_r.c:388
>            __random_r random_r.c:388               __random_r random_r.c:388
>            __random_r random_r.c:380               __random_r random_r.c:380
>            __random_r random_r.c:357               __random_r random_r.c:357
>                __random random.c:293                   __random random.c:293
>                __random random.c:293                   __random random.c:293
>                __random random.c:291                   __random random.c:291
>                __random random.c:291                   __random random.c:291
>                __random random.c:291                   __random random.c:291
>                __random random.c:288                   __random random.c:288
>                       rand rand.c:27                          rand rand.c:27
>                       rand rand.c:26                          rand rand.c:26
>                             rand@plt                                rand@plt
>                             rand@plt                                rand@plt
>                compute_flag div.c:25                   compute_flag div.c:25
>                compute_flag div.c:22                   compute_flag div.c:22
>                        main div.c:40                           main div.c:40
>                        main div.c:40                           main div.c:40
>                        main div.c:39                           main div.c:39
> 
> hot chain pair 3:
>               cycles: 9, hits: 4.48%                  cycles: 6, hits: 4.51%
>          ---------------------------              --------------------------
>            __random_r random_r.c:360               __random_r random_r.c:360
>            __random_r random_r.c:388               __random_r random_r.c:388
>            __random_r random_r.c:388               __random_r random_r.c:388
>            __random_r random_r.c:380               __random_r random_r.c:380
> 
> [ Hot streams in old perf data only ]
> 
> hot chain 1:
>              cycles: 18, hits: 6.75%
>           --------------------------
>            __random_r random_r.c:360
>            __random_r random_r.c:388
>            __random_r random_r.c:388
>            __random_r random_r.c:380
>            __random_r random_r.c:357
>                __random random.c:293
>                __random random.c:293
>                __random random.c:291
>                __random random.c:291
>                __random random.c:291
>                __random random.c:288
>                       rand rand.c:27
>                       rand rand.c:26
>                             rand@plt
>                             rand@plt
>                compute_flag div.c:25
>                compute_flag div.c:22
>                        main div.c:40
> 
> hot chain 2:
>              cycles: 29, hits: 2.78%
>           --------------------------
>                compute_flag div.c:22
>                        main div.c:40
>                        main div.c:40
>                        main div.c:39
> 
> [ Hot streams in new perf data only ]
> 
> hot chain 1:
>                                                       cycles: 4, hits: 4.54%
>                                                   --------------------------
>                                                                main div.c:42
>                                                        compute_flag div.c:28
> 
> hot chain 2:
>                                                       cycles: 5, hits: 3.51%
>                                                   --------------------------
>                                                                main div.c:39
>                                                                main div.c:44
>                                                                main div.c:42
>                                                        compute_flag div.c:28
>   
>   v8:
>   ---
>   Rebase to perf/core
> 
>   v7:
>   ---
>   Create a new struct evlist_streams which contains ev_streams and
>   nr_evsel, so we don't need to pass nr_evsel in stream related functions.
> 
>   Rename functions for better coding style.
> 
>   v6:
>   ---
>   Rebase to perf/core
> 
>   v5:
>   ---
>   1. Remove enum stream_type
>   2. Rebase to perf/core
> 
>   v4:
>   ---
>   The previous version is too big and very hard to review.
> 
>   1. v4 removes the code which supports the source line mapping
>      table and removes the source line based comparison. Now we
>      only support the basic functionality of stream comparison.
> 
>   2. Refactor the code in a generic way.
> 
>   v3:
>   ---
>   v2 has 14 patches, which is hard to review.
>   v3 has only 7 patches for basic stream comparison.
> 
> Jin Yao (7):
>    perf util: Create streams
>    perf util: Get the evsel_streams by evsel_idx
>    perf util: Compare two streams
>    perf util: Link stream pair
>    perf util: Calculate the sum of total streams hits
>    perf util: Report hot streams
>    perf diff: Support hot streams comparison
> 
>   tools/perf/Documentation/perf-diff.txt |   4 +
>   tools/perf/builtin-diff.c              | 119 ++++++++-
>   tools/perf/util/Build                  |   1 +
>   tools/perf/util/callchain.c            |  99 +++++++
>   tools/perf/util/callchain.h            |   9 +
>   tools/perf/util/stream.c               | 342 +++++++++++++++++++++++++
>   tools/perf/util/stream.h               |  41 +++
>   7 files changed, 602 insertions(+), 13 deletions(-)
>   create mode 100644 tools/perf/util/stream.c
>   create mode 100644 tools/perf/util/stream.h
> 
