Date:   Sun, 3 Jan 2021 23:52:19 +0100
From:   Jiri Olsa
To:     Leo Yan
Cc:     Arnaldo Carvalho de Melo,
        Peter Zijlstra,
        Ingo Molnar,
        Mark Rutland,
        Alexander Shishkin,
        Namhyung Kim,
        Andi Kleen,
        Ian Rogers,
        Kan Liang,
        Joe Mario, David Ahern,
        Don Zickus, Al Grant,
        James Clark
Subject: Re: [PATCH v2 00/11] perf c2c: Sort cacheline with all loads

On Sun, Dec 13, 2020 at 01:38:39PM +0000, Leo Yan wrote:
> This patch set sorts cache lines by all load operations that hit at
> any cache level.  For the single cache line view, it shows the load
> references for loads with cache hits and with cache misses
> respectively.
>
> This series follows up on the old patch set "perf c2c: Sort cacheline
> with LLC load" [1].  The old patch set tried to sort cache lines by
> load operations in the last level cache (LLC), but testing showed that
> the trace data doesn't contain LLC events if the platform isn't a NUMA
> system.  For this reason, this series refines the implementation to
> sort on load hits at all cache levels.  This still lets us review the
> load and store operations: if a cache line is accessed by multiple
> threads, that hints at possible false sharing on the cache line.
>
> This patch set applies cleanly on the perf/core branch with the latest
> commit db0ea13cc741 ("perf evlist: Use the right prefix for 'struct
> evlist' record methods").  The changes have been tested on x86 and
> Arm64; the testing results are shown below.
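
For illustration only, the idea in the cover letter above (rank cache
lines by load hits at any cache level, then treat lines touched by more
than one thread as possible false sharing) can be modelled roughly as
below.  This is a hypothetical Python sketch, not the code in
builtin-c2c.c: the CachelineStats class, its field names, and the
sample numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CachelineStats:
    """Hypothetical per-cache-line counters (not perf's real structures)."""
    address: int
    load_hits: int = 0      # loads that hit at any cache level
    load_misses: int = 0    # loads that missed the caches
    stores: int = 0
    threads: set = field(default_factory=set)  # thread ids seen on this line


def sort_by_load_hits(lines):
    """Return cache lines ordered by load hits, busiest first."""
    return sorted(lines, key=lambda cl: cl.load_hits, reverse=True)


def false_sharing_hints(lines):
    """Keep only lines accessed by more than one thread, still sorted."""
    return [cl for cl in sort_by_load_hits(lines) if len(cl.threads) > 1]


# Made-up sample data, for demonstration only.
samples = [
    CachelineStats(0x1000, load_hits=5, load_misses=1, stores=2, threads={1}),
    CachelineStats(0x1040, load_hits=40, load_misses=3, stores=10, threads={1, 2}),
    CachelineStats(0x1080, load_hits=12, load_misses=0, stores=0, threads={3, 4}),
]

ranked = [hex(cl.address) for cl in sort_by_load_hits(samples)]
hints = [hex(cl.address) for cl in false_sharing_hints(samples)]
print(ranked)
print(hints)
```

A line seen by several threads is only a hint; whether it is real false
sharing still depends on which offsets within the line each thread
loads from and stores to.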


>   [...]
> Changes from v1:
> * Changed from sorting on LLC to sorting on all loads with cache hits;
> * Added patches 06/11, 07/11 for refactoring macros;
> * Added patch 08/11 for refactoring the node header, so that it can
>   display "%loads" rather than "%hitms" in the header;
> * Added patch 09/11 to add local pointers for pointing to output metrics
>   string and sort string (Juri);
> * Added warning in percent_hitm() for the display "all", which should
>   never happen (Juri).
> [1]
> Leo Yan (11):
>   perf c2c: Add dimensions for total load hit
>   perf c2c: Add dimensions for load hit
>   perf c2c: Add dimensions for load miss
>   perf c2c: Rename for shared cache line stats
>   perf c2c: Refactor hist entry validation
>   perf c2c: Refactor display filter macro
>   perf c2c: Refactor node display macro
>   perf c2c: Refactor node header
>   perf c2c: Add local variables for output metrics
>   perf c2c: Sort on all cache hit for load operations
>   perf c2c: Update documentation for display option 'all'
>  tools/perf/Documentation/perf-c2c.txt |  21 +-
>  tools/perf/builtin-c2c.c              | 548 ++++++++++++++++++++++----
>  2 files changed, 487 insertions(+), 82 deletions(-)

Joe might want to test it first, but it looks all good to me:

Acked-by: Jiri Olsa <>

