Message-ID: <b376ad7e-020e-40a2-b26d-f70184753a50@arm.com>
Date: Fri, 13 Sep 2024 12:17:56 +0100
From: Leo Yan <leo.yan@....com>
To: James Clark <james.clark@...aro.org>, linux-perf-users@...r.kernel.org,
gankulkarni@...amperecomputing.com, coresight@...ts.linaro.org,
scclevenger@...amperecomputing.com
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
"Liang, Kan" <kan.liang@...ux.intel.com>,
Suzuki K Poulose <suzuki.poulose@....com>, Mike Leach
<mike.leach@...aro.org>, John Garry <john.g.garry@...cle.com>,
Will Deacon <will@...nel.org>, Leo Yan <leo.yan@...ux.dev>,
Ben Gainey <ben.gainey@....com>, Ruidong Tian
<tianruidong@...ux.alibaba.com>, Benjamin Gray <bgray@...ux.ibm.com>,
Mathieu Poirier <mathieu.poirier@...aro.org>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2 1/7] perf cs-etm: Don't flush when packet_queue fills up

On 9/12/24 16:11, James Clark wrote:
> cs_etm__flush(), like cs_etm__sample(), is an operation that generates a
> sample and then swaps the current with the previous packet. Calling
> flush after processing the queues results in two swaps, which corrupts
> the next sample. It was therefore not appropriate to call flush here, so
> remove it.
>
> Flushing is still done on a discontinuity to explicitly clear the last
> branch buffer, but when the packet_queue fills up before reaching a
> timestamp, that's not a discontinuity and the call to
> cs_etm__process_traceid_queue() already generated samples and drained
> the buffers correctly.
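
To make the "two swaps" problem concrete, here is a deliberately simplified sketch. It is not the perf code: the `toy_*` names, the two-slot layout and the addresses are assumptions made up for illustration. It only models the "emit a branch from the previous packet, then swap current and previous" pattern described above, and shows that one extra swap leaves the previous-packet pointer on stale data. It does not try to reproduce the exact symptom seen in the real trace.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; NOT the real perf structures. */
struct toy_packet {
	uint64_t start_addr;	/* first executed instruction of the range */
	uint64_t end_addr;	/* last executed instruction of the range */
	int valid;
};

struct toy_queue {
	struct toy_packet slots[2];
	struct toy_packet *packet;	/* slot the next range is decoded into */
	struct toy_packet *prev_packet;	/* range processed just before it */
};

struct toy_branch {
	uint64_t from;
	uint64_t to;
	int valid;
};

static void toy_init(struct toy_queue *q)
{
	q->slots[0].valid = 0;
	q->slots[1].valid = 0;
	q->packet = &q->slots[0];
	q->prev_packet = &q->slots[1];
}

static void toy_swap(struct toy_queue *q)
{
	struct toy_packet *tmp = q->packet;

	assert(q->packet != q->prev_packet);
	q->packet = q->prev_packet;
	q->prev_packet = tmp;
}

/* Decode one range, emit a branch from the previous range, then swap. */
static struct toy_branch toy_sample(struct toy_queue *q,
				    uint64_t start, uint64_t end)
{
	struct toy_branch br = { 0, 0, 0 };

	q->packet->start_addr = start;
	q->packet->end_addr = end;
	q->packet->valid = 1;

	if (q->prev_packet->valid) {
		br.from = q->prev_packet->end_addr;
		br.to = q->packet->start_addr;
		br.valid = 1;
	}

	toy_swap(q);
	return br;
}

/*
 * Source address of the branch into the third range (C), optionally with
 * one spurious extra swap after the second range (B) has been sampled.
 */
static uint64_t toy_third_branch_source(int extra_swap)
{
	struct toy_queue q;

	toy_init(&q);
	toy_sample(&q, 0x1000, 0x1010);		/* range A */
	toy_sample(&q, 0x2000, 0x2020);		/* range B */
	if (extra_swap)
		toy_swap(&q);			/* models the second swap */

	return toy_sample(&q, 0x3000, 0x3030).from;	/* range C */
}
```

In this model, `toy_third_branch_source(0)` yields the end of range B as expected, while `toy_third_branch_source(1)` yields the end of range A, because the extra swap makes the previous-packet pointer refer to the older range.
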
>
> This is visible by looking for a branch that has the same target as the
> previous branch, where the source of the following branch is before the
> address of that target. This is impossible, as execution would have had
> to go backwards:
>
> ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
> (packet_queue fills here before a timestamp, resulting in a flush and
> branch target ffff80008011cadc is duplicated.)
> ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff80008011cadc update_sg_lb_stats+0x94
> ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
>
> After removing the flush the correct branch target is used for the
> second sample, and ffff8000801117c4 is no longer before the previous
> address:
>
> ffff800080849d40 _find_next_and_bit+0x78 => ffff80008011cadc update_sg_lb_stats+0x94
> ffff80008011cb1c update_sg_lb_stats+0xd4 => ffff8000801117a0 cpu_util+0x0
> ffff8000801117c4 cpu_util+0x24 => ffff8000801117d4 cpu_util+0x34
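
The check described above can be sketched as a small scan over decoded branch records. This is only an illustration under stated assumptions: `struct branch` and `find_duplicated_target()` are hypothetical names, not part of perf, and the two arrays simply copy the addresses quoted in the samples above. A real trace may contain discontinuities that would need extra handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one branch sample: source => target. */
struct branch {
	uint64_t from;
	uint64_t to;
};

/*
 * Return the index of the first branch that repeats the previous branch's
 * target while the next branch's source lies before that target, or -1 if
 * no such branch is found.
 */
static long find_duplicated_target(const struct branch *b, size_t n)
{
	assert(b != NULL || n == 0);

	for (size_t i = 1; i + 1 < n; i++) {
		if (b[i].to == b[i - 1].to && b[i + 1].from < b[i].to)
			return (long)i;
	}

	return -1;
}

/* Addresses copied from the output before the fix. */
static const struct branch bad_trace[] = {
	{ 0xffff800080849d40ULL, 0xffff80008011cadcULL },
	{ 0xffff80008011cb1cULL, 0xffff80008011cadcULL },
	{ 0xffff8000801117c4ULL, 0xffff8000801117d4ULL },
};

/* Addresses copied from the output after the fix. */
static const struct branch good_trace[] = {
	{ 0xffff800080849d40ULL, 0xffff80008011cadcULL },
	{ 0xffff80008011cb1cULL, 0xffff8000801117a0ULL },
	{ 0xffff8000801117c4ULL, 0xffff8000801117d4ULL },
};
```

Run over `bad_trace`, the scan flags the second entry (index 1); over `good_trace` it finds nothing.
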
>
> Make sure that a final branch stack is output at the end of the trace
> by calling cs_etm__end_block(). This is already done for both the
> timeless decode paths.

It is right to call cs_etm__flush() only for a discontinuity packet and to
use cs_etm__end_block() to flush at the end of a data block. Thanks for
distinguishing these two cases.

> Fixes: 21fe8dc1191a ("perf cs-etm: Add support for CPU-wide trace scenarios")
> Reported-by: Ganapatrao Kulkarni <gankulkarni@...amperecomputing.com>
> Closes: https://lore.kernel.org/all/20240719092619.274730-1-gankulkarni@os.amperecomputing.com/
> Signed-off-by: James Clark <james.clark@...aro.org>
Reviewed-by: Leo Yan <leo.yan@....com>
> ---
> tools/perf/util/cs-etm.c | 25 ++++++++++++++++++-------
> 1 file changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 90f32f327b9b..242788ac9625 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -2490,12 +2490,6 @@ static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
>
>  		/* Ignore return value */
>  		cs_etm__process_traceid_queue(etmq, tidq);
> -
> -		/*
> -		 * Generate an instruction sample with the remaining
> -		 * branchstack entries.
> -		 */
> -		cs_etm__flush(etmq, tidq);
>  	}
>  }
>
> @@ -2638,7 +2632,7 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
>
>  	while (1) {
>  		if (!etm->heap.heap_cnt)
> -			goto out;
> +			break;
>
>  		/* Take the entry at the top of the min heap */
>  		cs_queue_nr = etm->heap.heap_array[0].queue_nr;
> @@ -2721,6 +2715,23 @@ static int cs_etm__process_timestamped_queues(struct cs_etm_auxtrace *etm)
>  		ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, cs_timestamp);
>  	}
>
> +	for (i = 0; i < etm->queues.nr_queues; i++) {
> +		struct int_node *inode;
> +
> +		etmq = etm->queues.queue_array[i].priv;
> +		if (!etmq)
> +			continue;
> +
> +		intlist__for_each_entry(inode, etmq->traceid_queues_list) {
> +			int idx = (int)(intptr_t)inode->priv;
> +
> +			/* Flush any remaining branch stack entries */
> +			tidq = etmq->traceid_queues[idx];
> +			ret = cs_etm__end_block(etmq, tidq);
> +			if (ret)
> +				return ret;
> +		}
> +	}
>  out:
>  	return ret;
>  }
> --
> 2.34.1
>