Message-ID: <20210227071056.GA3317@leoy-ThinkPad-X240s>
Date: Sat, 27 Feb 2021 15:10:56 +0800
From: Leo Yan <leo.yan@...aro.org>
To: James Clark <james.clark@....com>
Cc: coresight@...ts.linaro.org, al.grant@....com,
branislav.rankov@....com, denik@...omium.org,
suzuki.poulose@....com, Mike Leach <mike.leach@...aro.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
John Garry <john.garry@...wei.com>,
Will Deacon <will@...nel.org>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/7] perf cs-etm: Save aux records in each etm queue
On Fri, Feb 12, 2021 at 04:45:09PM +0200, James Clark wrote:
> The aux records will be used to set the bounds of decoding in a
> later commit. In the future we may also want to use the flags
> of each record to control decoding.
>
> Do these need to be saved in their entirety, or can pointers
> to each record safely be saved instead for later access?
Rather than introducing the list of perf records, I just wonder if we
can use a simpler method to fix this problem. So below is a rough idea
(though I haven't really verified it yet):
The essential information we need is the valid buffer length that can
be used for decoding. cs_etm_queue::buf_len tracks the buffer length,
but that is the length of the whole AUX buffer, which spans multiple
"PERF_RECORD_AUX" events. So we cannot decode the whole trace data in
the AUX trace buffer at once; instead, each incoming "PERF_RECORD_AUX"
event tells the CoreSight decoder how much data it should decode. In
the end, the trace data can be decoded step by step, driven by the
incoming "PERF_RECORD_AUX" events.
I'd like to propose adding a new field "cs_etm_queue::buf_rec_len",
which stands for the record length based on the RECORD_AUX events. In
theory, this value should always be less than "cs_etm_queue::buf_len".
Every time a "PERF_RECORD_AUX" event comes in, we find the
corresponding queue (so this can be applied to both "1:1" and "N:1"
source-to-sink models) and accumulate "perf_record_aux::aux_size" into
"cs_etm_queue::buf_rec_len".
On the decoder side, it decreases "etmq->buf_rec_len" down to zero for
the current round of decoding (see cs_etm__decode_data_block()). Since
all "PERF_RECORD_AUX" events are processed before the
"PERF_RECORD_EXIT" event, we don't need to worry about the tail of the
trace data being ignored.
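The decode loop would then be bounded by the announced size, roughly
as below (again only a sketch: decode_one_packet() is a hypothetical
stand-in for one decode step inside cs_etm__decode_data_block(), and
the "bytes consumed" return convention is an assumption, not the real
API):

  /*
   * Hypothetical helper: performs one decode step and returns how many
   * bytes of trace data it consumed, or 0 when nothing more can be
   * decoded.
   */
  static size_t decode_one_packet(struct cs_etm_queue_sketch *etmq);

  static int decode_announced_data(struct cs_etm_queue_sketch *etmq)
  {
          /* Only decode data already covered by RECORD_AUX events. */
          while (etmq->buf_rec_len > 0) {
                  size_t consumed = decode_one_packet(etmq);

                  if (!consumed)
                          break;  /* no forward progress, wait for more */
                  if (consumed > etmq->buf_rec_len)
                          consumed = etmq->buf_rec_len;

                  etmq->buf_rec_len -= consumed;
          }

          return 0;
  }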
The main reason for this suggestion is that it doesn't need to change
significant logic in the current code. I will try to experiment with
this idea and share the result.
James, if you think I missed anything, please correct me as needed.
Thanks!
Leo
> Signed-off-by: James Clark <james.clark@....com>
> ---
> tools/perf/util/cs-etm.c | 32 +++++++++++++++++++++++++++++---
> 1 file changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 8f8b448632fb..88b541b2a804 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -92,12 +92,16 @@ struct cs_etm_queue {
> /* Conversion between traceID and index in traceid_queues array */
> struct intlist *traceid_queues_list;
> struct cs_etm_traceid_queue **traceid_queues;
> + int aux_record_list_len;
> + int aux_record_list_idx;
> + struct perf_record_aux *aux_record_list;
> };
>
> /* RB tree for quick conversion between traceID and metadata pointers */
> static struct intlist *traceid_list;
>
> -static int cs_etm__update_queues(struct cs_etm_auxtrace *etm, int cpu);
> +static int cs_etm__update_queues(struct cs_etm_auxtrace *etm, int cpu,
> + struct perf_record_aux *aux_record);
> static int cs_etm__process_queues(struct cs_etm_auxtrace *etm);
> static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
> pid_t tid);
> @@ -585,6 +589,7 @@ static void cs_etm__free_queue(void *priv)
>
> cs_etm_decoder__free(etmq->decoder);
> cs_etm__free_traceid_queues(etmq);
> + free(etmq->aux_record_list);
> free(etmq);
> }
>
> @@ -759,6 +764,19 @@ static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm)
> return NULL;
> }
>
> +static int cs_etm__save_aux_record(struct cs_etm_queue *etmq,
> + struct perf_record_aux *aux_record)
> +{
> + etmq->aux_record_list = reallocarray(etmq->aux_record_list,
> + etmq->aux_record_list_len+1,
> + sizeof(*etmq->aux_record_list));
> + if (!etmq->aux_record_list)
> + return -ENOMEM;
> +
> + etmq->aux_record_list[etmq->aux_record_list_len++] = *aux_record;
> + return 0;
> +}
> +
> static int cs_etm__search_first_timestamp(struct cs_etm_queue *etmq)
> {
> int ret = 0;
> @@ -865,7 +883,7 @@ static int cs_etm__setup_queues(struct cs_etm_auxtrace *etm)
> return 0;
> }
>
> -static int cs_etm__update_queues(struct cs_etm_auxtrace *etm, int cpu)
> +static int cs_etm__update_queues(struct cs_etm_auxtrace *etm, int cpu, struct perf_record_aux *aux)
> {
> int ret;
> if (etm->queues.new_data) {
> @@ -875,6 +893,14 @@ static int cs_etm__update_queues(struct cs_etm_auxtrace *etm, int cpu)
> return ret;
> }
>
> + /* In timeless mode, cpu is set to -1, and a single aux buffer is filled */
> + if (cpu < 0)
> + cpu = 0;
> +
> + ret = cs_etm__save_aux_record(etm->queues.queue_array[cpu].priv, aux);
> + if (ret)
> + return ret;
> +
> if (!etm->timeless_decoding)
> return cs_etm__search_first_timestamp(etm->queues.queue_array[cpu].priv);
> else
> @@ -2357,7 +2383,7 @@ static int cs_etm__process_event(struct perf_session *session,
>
> if ((timestamp || etm->timeless_decoding)
> && event->header.type == PERF_RECORD_AUX) {
> - err = cs_etm__update_queues(etm, sample->cpu);
> + err = cs_etm__update_queues(etm, sample->cpu, &event->aux);
> if (err)
> return err;
> }
> --
> 2.28.0
>