Message-ID: <369fd454-d94d-daa1-ead4-b42645ec4282@arm.com>
Date: Fri, 25 Jun 2021 14:25:15 +0100
From: James Clark <james.clark@....com>
To: Leo Yan <leo.yan@...aro.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
John Garry <john.garry@...wei.com>,
Will Deacon <will@...nel.org>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Dave Martin <Dave.Martin@....com>, Al Grant <Al.Grant@....com>,
linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 5/5] perf arm-spe: Don't wait for PERF_RECORD_EXIT event
On 19/05/2021 08:19, Leo Yan wrote:
> When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
> perf event) for processing trace data, which is needless and even might
> cause logic error, e.g. it might fail to correlate perf events with Arm
> SPE events correctly.
>
> So this patch removes the condition checking for PERF_RECORD_EXIT event.
>
> Signed-off-by: Leo Yan <leo.yan@...aro.org>
> ---
> tools/perf/util/arm-spe.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
> index 5c5b438584c4..58b7069c5a5f 100644
> --- a/tools/perf/util/arm-spe.c
> +++ b/tools/perf/util/arm-spe.c
> @@ -717,11 +717,7 @@ static int arm_spe_process_event(struct perf_session *session,
> sample->time);
> }
> } else if (timestamp) {
> - if (event->header.type == PERF_RECORD_EXIT) {
> - err = arm_spe_process_queues(spe, timestamp);
> - if (err)
> - return err;
> - }
> + err = arm_spe_process_queues(spe, timestamp);
> }
>
> return err;
>
For the whole set:
Reviewed-by: James Clark <james.clark@....com>
Tested-by: James Clark <james.clark@....com>
I see a big improvement when decoding workloads that involve multiple processes, because
the SPE timestamps are now correlated with the comm and mmap events.
perf-exec samples are now visible right before the exec is done, and on an application
that forks, samples are visible from all processes. For example:
perf record -e arm_spe// -- bash -c "stress -c 1"
perf script
perf-exec 4502 [003] 259755.050409: 1 l1d-access: ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
perf-exec 4502 [003] 259755.050409: 1 tlb-access: ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
perf-exec 4502 [003] 259755.050409: 1 memory: ffff80001014b840 sched_clock+0x40 ([kernel.kallsyms])
perf-exec 4502 [003] 259755.050411: 1 tlb-access: ffff800010120fb8 __rcu_read_lock+0x0 ([kernel.kallsyms])
bash 4502 [003] 259755.050411: 1 branch-miss: ffff8000105b2a40 memcpy+0x80 ([kernel.kallsyms])
bash 4502 [003] 259755.050411: 1 tlb-access: 0 [unknown] ([unknown])
...
stress 4502 [003] 259755.051468: 1 l1d-access: ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
stress 4502 [003] 259755.051468: 1 tlb-access: ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
stress 4502 [003] 259755.051468: 1 memory: ffff800010259a24 __vma_adjust+0x1f4 ([kernel.kallsyms])
Previously, all samples were attributed to 'stress', which was obviously wrong.
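To illustrate why the ordering matters, here is a toy model of the two behaviours. It is
only a sketch and not the perf code: the types, the function names (process_queues,
process_event, replay), the wait_for_exit switch and the initial "perf-exec" comm are all
made up for the example. The idea is that trace samples older than an event get decoded
before that event changes the current comm, instead of all at once when the exit event
arrives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these types or functions exist in perf. */
enum ev_type { EV_COMM, EV_EXIT };

struct event {
	enum ev_type type;
	unsigned long long time;
	const char *comm;		/* new name for EV_COMM, unused otherwise */
};

struct sample {
	unsigned long long time;
	const char *comm;		/* filled in when the sample is decoded */
};

struct decoder {
	const char *cur_comm;		/* comm as of the last event seen */
	struct sample *samples;		/* queued trace samples, sorted by time */
	size_t nr_samples;
	size_t next;			/* first sample not yet decoded */
};

/* Decode every queued sample older than 'timestamp' with the current comm. */
static void process_queues(struct decoder *d, unsigned long long timestamp)
{
	while (d->next < d->nr_samples &&
	       d->samples[d->next].time < timestamp) {
		d->samples[d->next].comm = d->cur_comm;
		d->next++;
	}
}

/*
 * wait_for_exit != 0 models the old behaviour (decode only on the exit
 * event); wait_for_exit == 0 models decoding on every timestamped event.
 */
static void process_event(struct decoder *d, const struct event *ev,
			  int wait_for_exit)
{
	if (!wait_for_exit || ev->type == EV_EXIT)
		process_queues(d, ev->time);

	if (ev->type == EV_COMM)
		d->cur_comm = ev->comm;
}

/* Replay all events over the samples; "perf-exec" is an assumed initial comm. */
static void replay(struct sample *samples, size_t nr_samples,
		   const struct event *events, size_t nr_events,
		   int wait_for_exit)
{
	struct decoder d = {
		.cur_comm = "perf-exec",
		.samples = samples,
		.nr_samples = nr_samples,
		.next = 0,
	};

	assert(samples != NULL || nr_samples == 0);
	for (size_t i = 0; i < nr_events; i++)
		process_event(&d, &events[i], wait_for_exit);
}

/* Return 1 if every decoded sample carries 'comm', 0 otherwise. */
static int all_attributed_to(const struct sample *samples, size_t nr_samples,
			     const char *comm)
{
	for (size_t i = 0; i < nr_samples; i++)
		if (!samples[i].comm || strcmp(samples[i].comm, comm) != 0)
			return 0;
	return 1;
}
```

With samples at times 10, 30 and 50 and events comm "bash" at 20, comm "stress" at 40 and
exit at 60, replaying with wait_for_exit set attributes all three samples to "stress",
while replaying with it cleared attributes them to "perf-exec", "bash" and "stress" in
that order.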
James