Message-ID: <20210103225219.GA850408@krava>
Date: Sun, 3 Jan 2021 23:52:19 +0100
From: Jiri Olsa <jolsa@...hat.com>
To: Leo Yan <leo.yan@...aro.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Joe Mario <jmario@...hat.com>, David Ahern <dsahern@...il.com>,
Don Zickus <dzickus@...hat.com>, Al Grant <Al.Grant@....com>,
James Clark <james.clark@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 00/11] perf c2c: Sort cacheline with all loads
On Sun, Dec 13, 2020 at 01:38:39PM +0000, Leo Yan wrote:
> This patch set sorts cache lines by all load operations that hit any
> cache level. For the single cache line view, it shows the load
> references for loads with cache hits and with cache misses respectively.
>
> This series follows up on the old patch set "perf c2c: Sort
> cacheline with LLC load" [1]. The old patch set tried to sort
> cache lines by the load operations in the last level cache (LLC); after
> testing, we found the trace data doesn't contain LLC events if the
> platform isn't a NUMA system. For this reason, this series refines the
> implementation to sort on load hits at all cache levels. It's
> reasonable for us to review the load and store operations: if any cache
> line is detected as accessed by multiple threads, this hints that the
> cache line is a candidate for false sharing.
>
> This patch set applies cleanly on the perf/core branch with the latest
> commit db0ea13cc741 ("perf evlist: Use the right prefix for 'struct
> evlist' record methods"). The changes have been tested on x86 and
> Arm64; the test results are shown below.
SNIP
>
>
> [...]
>
> Changes from v1:
> * Changed from sorting on LLC to sorting on all loads with cache hits;
> * Added patches 06/11, 07/11 for refactoring macros;
> * Added patch 08/11 for refactoring the node header, so it can display
> "%loads" rather than "%hitms" in the header;
> * Added patch 09/11 to add local pointers for pointing to output metrics
> string and sort string (Juri);
> * Added warning in percent_hitm() for the display "all", which should
> never happen (Juri).
>
> [1] https://lore.kernel.org/patchwork/cover/1321514/
>
>
> Leo Yan (11):
> perf c2c: Add dimensions for total load hit
> perf c2c: Add dimensions for load hit
> perf c2c: Add dimensions for load miss
> perf c2c: Rename for shared cache line stats
> perf c2c: Refactor hist entry validation
> perf c2c: Refactor display filter macro
> perf c2c: Refactor node display macro
> perf c2c: Refactor node header
> perf c2c: Add local variables for output metrics
> perf c2c: Sort on all cache hit for load operations
> perf c2c: Update documentation for display option 'all'
>
> tools/perf/Documentation/perf-c2c.txt | 21 +-
> tools/perf/builtin-c2c.c | 548 ++++++++++++++++++++++----
> 2 files changed, 487 insertions(+), 82 deletions(-)
Joe might want to test it first, but it looks all good to me:
Acked-by: Jiri Olsa <jolsa@...hat.com>
thanks,
jirka