Message-ID: <CAP-5=fU7RNzvzxBcAQy3RT9Ge3YtqPhDonupNWS7Wgb8HGQkGg@mail.gmail.com>
Date: Tue, 17 Dec 2024 16:54:19 -0800
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: linux-arm-kernel@...ts.infradead.org, linux-perf-users@...r.kernel.org,
Will Deacon <will@...nel.org>, Mark Rutland <mark.rutland@....com>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, "Liang, Kan" <kan.liang@...ux.intel.com>,
John Garry <john.g.garry@...cle.com>, Mike Leach <mike.leach@...aro.org>,
Leo Yan <leo.yan@...ux.dev>, Graham Woodward <graham.woodward@....com>,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH 5/5] perf docs: arm_spe: Document new discard mode
On Tue, Dec 17, 2024 at 3:56 AM James Clark <james.clark@...aro.org> wrote:
>
> Document the flag, hint what it's used for and give an example with
> other useful options to get minimal output.
>
> Signed-off-by: James Clark <james.clark@...aro.org>
> ---
> tools/perf/Documentation/perf-arm-spe.txt | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt
> index de2b0b479249..588eead438bc 100644
> --- a/tools/perf/Documentation/perf-arm-spe.txt
> +++ b/tools/perf/Documentation/perf-arm-spe.txt
> @@ -150,6 +150,7 @@ arm_spe/load_filter=1,min_latency=10/'
> pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege
> store_filter=1 - collect stores only (PMSFCR.ST)
> ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS)
> + discard=1 - enable SPE PMU events but don't collect sample data - see 'Discard mode' (PMBLIMITR.FM = DISCARD)
>
> +++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather
> than only the execution latency.
> @@ -220,6 +221,16 @@ Common errors
>
> Increase sampling interval (see above)
>
> +Discard mode
> +~~~~~~~~~~~~
> +
> +SPE PMU events can be used without the overhead of collecting sample data if
> +discard mode is supported (optional from Armv8.6). First run a system wide SPE
> +session (or on the core of interest) using options to minimize output. Then run
> +perf stat:
> +
> + perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
> + perf stat -e SAMPLE_FEED_LD
Perhaps clarify that this should be an ARM SPE event? It seems strange to
have one perf command affect a later one; the purpose of things like
event multiplexing is to hide the hardware limits. I'd prefer it if the
last bit read like:
```
Then run perf stat with an SPE event on the same PMU:
perf record -e arm_spe/discard/ -a -N -B --no-bpf-event -o - > /dev/null &
perf stat -e arm_spe/SAMPLE_FEED_LD/
```
Thanks,
Ian