Message-ID: <20260119111509.GD1286628@e132581.arm.com>
Date: Mon, 19 Jan 2026 11:15:09 +0000
From: Leo Yan <leo.yan@....com>
To: James Clark <james.clark@...aro.org>
Cc: Suzuki K Poulose <suzuki.poulose@....com>,
Mike Leach <mike.leach@...aro.org>,
John Garry <john.g.garry@...cle.com>, Will Deacon <will@...nel.org>,
Leo Yan <leo.yan@...ux.dev>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Thomas Falcon <thomas.falcon@...el.com>, coresight@...ts.linaro.org,
linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps
On Mon, Jan 19, 2026 at 10:18:35AM +0000, Coresight ML wrote:
> The ETM decoder incorrectly assumed that auxtrace queue indices were
> equivalent to CPU number. This assumption is used for inserting records
> into the queue, and for fetching queues when given a CPU number. This
> assumption held when Perf always opened a dummy event on every CPU, even
> if the user provided a subset of CPUs on the commandline, resulting in
> the indices aligning.
>
> For example:
>
> # event : name = cs_etm//u, , id = { 2451, 2452 }, type = 11 (cs_etm), size = 136, config = 0x4010, { sample_period, samp>
> # event : name = dummy:u, , id = { 2453, 2454, 2455, 2456 }, type = 1 (PERF_TYPE_SOFTWARE), size = 136, config = 0x9 (PER>
>
> 0 0 0x200 [0xd0]: PERF_RECORD_ID_INDEX nr: 6
> ... id: 2451 idx: 2 cpu: 2 tid: -1
> ... id: 2452 idx: 3 cpu: 3 tid: -1
> ... id: 2453 idx: 0 cpu: 0 tid: -1
> ... id: 2454 idx: 1 cpu: 1 tid: -1
> ... id: 2455 idx: 2 cpu: 2 tid: -1
> ... id: 2456 idx: 3 cpu: 3 tid: -1
>
> Since commit 811082e4b668 ("perf parse-events: Support user CPUs mixed
> with threads/processes") the dummy event no longer behaves in this way,
> making the ETM event indices start from 0 on the first CPU recorded
> regardless of its ID:
>
> # event : name = cs_etm//u, , id = { 771, 772 }, type = 11 (cs_etm), size = 144, config = 0x4010, { sample_period, sample>
> # event : name = dummy:u, , id = { 773, 774 }, type = 1 (PERF_TYPE_SOFTWARE), size = 144, config = 0x9 (PERF_COUNT_SW_DUM>
>
> 0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
> ... id: 771 idx: 0 cpu: 2 tid: -1
> ... id: 772 idx: 1 cpu: 3 tid: -1
> ... id: 773 idx: 0 cpu: 2 tid: -1
> ... id: 774 idx: 1 cpu: 3 tid: -1
It seems to me that this patch works around the issue by using the CPU ID
instead, but event->auxtrace.idx itself is broken.
Should we instead store the correct index in event->auxtrace.idx (e.g. in
__perf_event__synthesize_id_index())?
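To make the mismatch concrete, here is a minimal standalone sketch. It is
not perf code: the struct and helper names are hypothetical and only model
the cs_etm entries of the second ID_INDEX dump quoted above. Treating the
CPU number as the queue index falls outside the two queues that exist,
while searching for the CPU finds the right slot.

```c
#include <stddef.h>

/* Hypothetical stand-in for one PERF_RECORD_ID_INDEX entry. */
struct id_index_entry {
	int id;
	int idx;
	int cpu;
};

/* The cs_etm entries from the second dump quoted above (-C 2,3). */
static const struct id_index_entry entries[] = {
	{ 771, 0, 2 },
	{ 772, 1, 3 },
};

#define NR_QUEUES (sizeof(entries) / sizeof(entries[0]))

/*
 * Models the old assumption that the CPU number equals the queue index.
 * Returns -1 when that index is out of range (assumed behaviour for this
 * sketch only).
 */
static int queue_by_assumed_index(int cpu)
{
	return (cpu >= 0 && (size_t)cpu < NR_QUEUES) ? cpu : -1;
}

/* Looks the queue up by the CPU it was recorded on instead. */
static int queue_by_cpu(int cpu)
{
	for (size_t i = 0; i < NR_QUEUES; i++) {
		if (entries[i].cpu == cpu)
			return entries[i].idx;
	}
	return -1;
}
```

With these entries, queue_by_assumed_index() finds no queue for CPU 2 or
CPU 3, whereas queue_by_cpu() maps them to idx 0 and idx 1.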
Thanks,
Leo
> This causes the following segfault when decoding:
>
> $ perf record -e cs_etm//u -C 2,3 -- true
> $ perf report
>
> perf: Segmentation fault
> -------- backtrace --------
> #0 0xaaaabf9fd020 in ui__signal_backtrace setup.c:110
> #1 0xffffab5c7930 in __kernel_rt_sigreturn [vdso][930]
> #2 0xaaaabfb68d30 in cs_etm_decoder__reset cs-etm-decoder.c:85
> #3 0xaaaabfb65930 in cs_etm__get_data_block cs-etm.c:2032
> #4 0xaaaabfb666fc in cs_etm__run_per_cpu_timeless_decoder cs-etm.c:2551
> #5 0xaaaabfb6692c in cs_etm__process_timeless_queues cs-etm.c:2612
> #6 0xaaaabfb63390 in cs_etm__flush_events cs-etm.c:921
> #7 0xaaaabfb324c0 in auxtrace__flush_events auxtrace.c:2915
> #8 0xaaaabfaac378 in __perf_session__process_events session.c:2285
> #9 0xaaaabfaacc9c in perf_session__process_events session.c:2442
> #10 0xaaaabf8d3d90 in __cmd_report builtin-report.c:1085
> #11 0xaaaabf8d6944 in cmd_report builtin-report.c:1866
> #12 0xaaaabf95ebfc in run_builtin perf.c:351
> #13 0xaaaabf95eeb0 in handle_internal_command perf.c:404
> #14 0xaaaabf95f068 in run_argv perf.c:451
> #15 0xaaaabf95f390 in main perf.c:558
> #16 0xffffaab97400 in __libc_start_call_main libc_start_call_main.h:74
> #17 0xffffaab974d8 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
> #18 0xaaaabf8aa8f0 in _start perf[7a8f0]
>
> Fix it by inserting into the queues based on CPU number, rather than
> using the index.
>
> Fixes: 811082e4b668 ("perf parse-events: Support user CPUs mixed with threads/processes")
> Signed-off-by: James Clark <james.clark@...aro.org>
> ---
> tools/perf/util/cs-etm.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 25d56e0f1c07..12b55c2bc2ca 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>
> if (aux_offset >= auxtrace_event->offset &&
> aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
> - struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
> + struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);
>
> /*
> * If this AUX event was inside this buffer somewhere, create a new auxtrace event
> @@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
> auxtrace_fragment.auxtrace = *auxtrace_event;
> auxtrace_fragment.auxtrace.size = aux_size;
> auxtrace_fragment.auxtrace.offset = aux_offset;
> + auxtrace_fragment.auxtrace.idx = etmq->queue_nr;
> file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
>
> pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64
>
> --
> 2.34.1