linux-kernel - Re: [PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <dd9a3cde-ea9f-4c53-b537-f15767bf4d99@linaro.org>
Date: Mon, 19 Jan 2026 15:16:01 +0000
From: James Clark <james.clark@...aro.org>
To: Leo Yan <leo.yan@....com>
Cc: Suzuki K Poulose <suzuki.poulose@....com>,
 Mike Leach <mike.leach@...aro.org>, John Garry <john.g.garry@...cle.com>,
 Will Deacon <will@...nel.org>, Leo Yan <leo.yan@...ux.dev>,
 Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
 Adrian Hunter <adrian.hunter@...el.com>,
 Thomas Falcon <thomas.falcon@...el.com>, coresight@...ts.linaro.org,
 linux-arm-kernel@...ts.infradead.org, linux-perf-users@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] perf cs-etm: Fix decoding for sparse CPU maps



On 19/01/2026 2:53 pm, Leo Yan wrote:
> On Mon, Jan 19, 2026 at 11:55:53AM +0000, James Clark wrote:
> 
> [...]
> 
>>>>     0 0 0x200 [0x90]: PERF_RECORD_ID_INDEX nr: 4
>>>>     ... id: 771  idx: 0  cpu: 2  tid: -1
>>>>     ... id: 772  idx: 1  cpu: 3  tid: -1
>>>>     ... id: 773  idx: 0  cpu: 2  tid: -1
>>>>     ... id: 774  idx: 1  cpu: 3  tid: -1
>>>
>>> Seems to me that this patch works around the issue by using the CPU ID
>>> instead, but event->auxtrace.idx is broken.
>>
>> I don't think it's a workaround, I think it should have been written this
>> way in the first place.
> 
> Agree.  Sorry for confusion in my rely.
> 
> [...]
> 
>>>> @@ -3086,7 +3086,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>>>    	if (aux_offset >= auxtrace_event->offset &&
>>>>    	    aux_offset + aux_size <= auxtrace_event->offset + auxtrace_event->size) {
>>>> -		struct cs_etm_queue *etmq = etm->queues.queue_array[auxtrace_event->idx].priv;
>>>> +		struct cs_etm_queue *etmq = cs_etm__get_queue(etm, auxtrace_event->cpu);
> 
> During decoding, cs_etm allocates its own AUX queues and needs to
> convert the AUX records into a cs_etm-specific format.
> 
> For example:
> 
>    event_id=405 idx=0 cpu=0 tid=6673
>    event_id=406 idx=1 cpu=1 tid=6673
>    event_id=407 idx=2 cpu=3 tid=6673
>    event_id=408 idx=3 cpu=4 tid=6673
>    event_id=409 idx=4 cpu=5 tid=6673
>    `            `      `> CPU ID
>     `> Event ID  `> Generic buffer IDs
> 
> In this case, I deliberately hotplugged off CPU2.  As a result, the
> buffer IDs are consecutive from 0 to 4, while the CPU ID for CPU2 is
> missing.  When adding AUX records to the cs_etm queues, we need to
> convert from the generic buffer index to the corresponding CPU ID.
> 
> So the above change makes sense to me.
> 
>>>>    		/*
>>>>    		 * If this AUX event was inside this buffer somewhere, create a new auxtrace event
>>>> @@ -3095,6 +3095,7 @@ static int cs_etm__queue_aux_fragment(struct perf_session *session, off_t file_o
>>>>    		auxtrace_fragment.auxtrace = *auxtrace_event;
>>>>    		auxtrace_fragment.auxtrace.size = aux_size;
>>>>    		auxtrace_fragment.auxtrace.offset = aux_offset;
>>>> +		auxtrace_fragment.auxtrace.idx = etmq->queue_nr;
> 
> I am still confused about this.  Because above auxtrace will be queued
> on cs_etm queues, should we convert from the generic buffer index to the
> cs_etm specific one?  E.g.,
> 
>    auxtrace_fragment.auxtrace.idx = auxtrace_event->cpu;
> 

This is basically what the change is already doing. etmq->queue_nr == 
CPU because cs_etm__setup_queue() is called for every CPU up to max_cpu, 
not only for ones that were recorded so it has a 1:1 mapping. Using 
etmq->queue_nr also works in per-thread mode where queue 0 is used, 
which your suggestion doesn't handle. Otherwise they would be equivalent.

To be honest the whole decoder has become hacks on top of hacks. I think 
we might want to do a full from scratch re-write when we come to do the 
proper timestamp to MMAP event correlation or the change to not keep 
resetting the decoder for TRBE wrap mode.

> BTW, if my understanding above is valid, it is good to go through the
> cs_etm.c file for the "idx <-> CPU ID" conversion.
> 

Do you mean there are other mistakes? I did check the rest and did some 
testing and didn't see any issues.

> Thanks,
> Leo
> 
>>>>    		file_offset += aux_offset - auxtrace_event->offset + auxtrace_event->header.size;
>>>>    		pr_debug3("CS ETM: Queue buffer size: %#"PRI_lx64" offset: %#"PRI_lx64
>>>>
>>>> -- 
>>>> 2.34.1
>>>>
>>>> _______________________________________________
>>>> CoreSight mailing list -- coresight@...ts.linaro.org
>>>> To unsubscribe send an email to coresight-leave@...ts.linaro.org
>>