[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <945e58b5-5012-45a8-933a-c1a192fd006e@linaro.org>
Date: Tue, 23 Jul 2024 16:26:40 +0100
From: James Clark <james.clark@...aro.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Andi Kleen <ak@...ux.intel.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, Athira Rajeev <atrajeev@...ux.vnet.ibm.com>
Subject: Re: [PATCH v1 2/2] perf script: Fix for `perf script +F metric` with
leader sampling
On 23/07/2024 3:57 pm, Ian Rogers wrote:
> On Tue, Jul 23, 2024 at 7:41 AM James Clark <james.clark@...aro.org> wrote:
>>
>>
>>
>> On 20/07/2024 8:45 am, Ian Rogers wrote:
>>> Andi Kleen reported a regression where `perf script +F metric` would
>>> crash. With this change the output is:
>>>
>>> ```
>>> $ perf record -a -e '{cycles,instructions}:S' perf bench mem memcpy
>>>
>>> 21.229620 GB/sec
>>>
>>> 15.751008 GB/sec
>>>
>>> 16.009221 GB/sec
>>> [ perf record: Woken up 1 times to write data ]
>>> [ perf record: Captured and wrote 1.945 MB perf.data (294 samples) ]
>>> $ perf --no-pager script -F +metric
>>> perf 1912464 [000] 814503.473101: 6325 cycles: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
>>> perf 1912464 [000] 814503.473101: metric: 0.06 insn per cycle
>>> perf 1912464 [000] 814503.473101: 351 instructions: ffffffff8548d64a native_write_msr+0xa ([kernel.kallsyms])
>>> perf 1912464 [000] 814503.473101: metric: 0.03 insn per cycle
>>> ...
>>> ```
>>
>> For some reason I only get the metric: lines when I record with -a. I
>> noticed this because Andi's test doesn't use -a so it fails.
>>
>> I'm not sure if that's expected or it's related to your disclaimer below?
>
> It is. When you don't do -a the cpu map just contains -1 and for some
> reason it is busted. The whole indirections to arrays of arrays,
> counts, stats, aggregations, with indices into various other arrays
> and a lack of helpers. The code works for perf stat, but there is a
> lot of complexity that I don't fully grok in that. Here I've tried to
> kind of break down what the code is trying to do in the comments, but
> the old code never did sample_read_group__for_each so was is broken
> with leader sampling? Is the leader sampling pretending the read
> counts are periods and calling process sample multiple times. Andi
> likely knows this code better than me so I was hoping he could fix it
> up. We may want to take the patches anyway in order to not have a
> segv.
>
> Thanks,
> Ian
>
Yeah I suppose it's strictly better now without the segfault. Could you
pull in the test and update it to add -a? At least then that behavior
will be locked down and we can extend it later without -a.
I also tested Andi's V5 and still got the segfault.
>>>
>>> The change fixes perf script to update counts and thereby aggregate
>>> values which then get consumed by unchanged metric logic in the shadow
>>> stat output. Note, it would be preferential to switch to json metrics.
>>>
>>> Reported-by: Andi Kleen <ak@...ux.intel.com>
>>> Closes: https://lore.kernel.org/linux-perf-users/20240713155443.1665378-1-ak@linux.intel.com/
>>> Fixes: 37cc8ad77cf8 ("perf metric: Directly use counts rather than saved_value")'
>>> Signed-off-by: Ian Rogers <irogers@...gle.com>
>>> ---
>>> The code isn't well tested nor does it support non-leader sampling
>>> reading of counts based on periods that seemed to present in the
>>> previous code. Sending out for the sake of discussion. Andi's changes
>>> added a test and that should certainly be added.
>>> ---
>>> tools/perf/builtin-script.c | 114 +++++++++++++++++++++++++++++-------
>>> 1 file changed, 93 insertions(+), 21 deletions(-)
>>>
Powered by blists - more mailing lists