Message-ID: <Y9AlVmA+dZLm2uwi@kernel.org>
Date: Tue, 24 Jan 2023 15:37:10 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: James Clark <james.clark@....com>
Cc: linux-perf-users@...r.kernel.or, linux-kernel@...r.kernel.org,
leo.yan@...aro.com, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH] perf mem/c2c: Document that SPE is used for mem and c2c on Arm
On Tue, Jan 24, 2023 at 02:59:29PM +0000, James Clark wrote:
> Setup is non-trivial so also link to the full SPE docs.
Thanks, applied.
- Arnaldo
> Signed-off-by: James Clark <james.clark@....com>
> ---
> tools/perf/Documentation/perf-c2c.txt | 8 ++++++--
> tools/perf/Documentation/perf-mem.txt | 7 ++++++-
> 2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt
> index af5c3106f468..4e8c263e1721 100644
> --- a/tools/perf/Documentation/perf-c2c.txt
> +++ b/tools/perf/Documentation/perf-c2c.txt
> @@ -22,7 +22,11 @@ you to track down the cacheline contentions.
> On Intel, the tool is based on load latency and precise store facility events
> provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
> with thresholding feature. On AMD, the tool uses IBS op pmu (due to hardware
> -limitations, perf c2c is not supported on Zen3 cpus).
> +limitations, perf c2c is not supported on Zen3 cpus). On Arm64 it uses SPE to
> +sample load and store operations, therefore hardware and kernel support is
> +required. See linkperf:perf-arm-spe[1] for a setup guide. Due to the
> +statistical nature of Arm SPE sampling, not every memory operation will be
> +sampled.
>
> These events provide:
> - memory address of the access
> @@ -333,4 +337,4 @@ Check Joe's blog on c2c tool for detailed use case explanation:
>
> SEE ALSO
> --------
> -linkperf:perf-record[1], linkperf:perf-mem[1]
> +linkperf:perf-record[1], linkperf:perf-mem[1], linkperf:perf-arm-spe[1]
> diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt
> index 005c95580b1e..19862572e3f2 100644
> --- a/tools/perf/Documentation/perf-mem.txt
> +++ b/tools/perf/Documentation/perf-mem.txt
> @@ -23,6 +23,11 @@ Note that on Intel systems the memory latency reported is the use-latency,
> not the pure load (or store latency). Use latency includes any pipeline
> queueing delays in addition to the memory subsystem latency.
>
> +On Arm64 this uses SPE to sample load and store operations, therefore hardware
> +and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide.
> +Due to the statistical nature of SPE sampling, not every memory operation will
> +be sampled.
> +
> OPTIONS
> -------
> <command>...::
> @@ -93,4 +98,4 @@ all perf record options.
>
> SEE ALSO
> --------
> -linkperf:perf-record[1], linkperf:perf-report[1]
> +linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]
>
> base-commit: 5670ebf54bd26482f57a094c53bdc562c106e0a9
> --
> 2.39.1
>
--
- Arnaldo