Message-ID: <b9ff0a13-7a77-4cb3-b8c0-e5fdf2d86e87@linaro.org>
Date: Tue, 27 Aug 2024 10:13:21 +0100
From: James Clark <james.clark@...aro.org>
To: "Liang, Kan" <kan.liang@...ux.intel.com>, Ian Rogers <irogers@...gle.com>
Cc: linux-perf-users@...r.kernel.org, John Garry <john.g.garry@...cle.com>,
Will Deacon <will@...nel.org>, Mike Leach <mike.leach@...aro.org>,
Leo Yan <leo.yan@...ux.dev>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Weilin Wang <weilin.wang@...el.com>,
Athira Rajeev <atrajeev@...ux.vnet.ibm.com>,
Dominique Martinet <asmadeus@...ewreck.org>,
Yang Jihong <yangjihong@...edance.com>,
Colin Ian King <colin.i.king@...il.com>, Andi Kleen <ak@...ux.intel.com>,
Ze Gao <zegao2021@...il.com>, Jing Zhang <renyu.zj@...ux.alibaba.com>,
Sun Haiyong <sunhaiyong@...ngson.cn>, Yicong Yang
<yangyicong@...ilicon.com>, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/7] Event parsing fixes
On 22/08/2024 4:18 pm, Liang, Kan wrote:
>
>
> On 2024-08-22 11:10 a.m., Ian Rogers wrote:
>> On Thu, Aug 22, 2024 at 7:32 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>>
>>>
>>>
>>> On 2024-08-22 9:24 a.m., James Clark wrote:
>>>> I rebased this one and made some other fixes so that I could test it,
>>>> so I thought I'd repost it here in case it's helpful. I also added a
>>>> new test.
>>>>
>>>> But for the testing it all looks ok.
>>>>
>>>> There is one small difference where it now shows "stalled-cycles-..."
>>>> as <not supported> events, when before it just didn't show them at all when
>>>> they weren't supported:
>>>>
>>>> $ perf stat -- true
>>>>
>>>> Performance counter stats for 'true':
>>>>
>>>> 0.66 msec task-clock # 0.384 CPUs utilized
>>>> 0 context-switches # 0.000 /sec
>>>> 0 cpu-migrations # 0.000 /sec
>>>> 52 page-faults # 78.999 K/sec
>>>> <not counted> cpu_atom/instructions/ (0.00%)
>>>> 978,399 cpu_core/instructions/ # 1.02 insn per cycle
>>>> <not counted> cpu_atom/cycles/ (0.00%)
>>>> 959,722 cpu_core/cycles/ # 1.458 GHz
>>>> <not supported> cpu_atom/stalled-cycles-frontend/
>>>> <not supported> cpu_core/stalled-cycles-frontend/
>>>>
>>>
>>> Intel didn't support the events for a very long time. It would impact
>>> many existing generations and all future generations.
>>> The current method is to hide the non-exist events. The TopdownL1 is an
>>> example. If it doesn't exist in the json file, perf stat will not
>>> display it.
>>> I don't think it's a good idea to disclose non-exist events in the perf
>>> stat default.
>>>
>>> The <not supported> doesn't help here, since there could be many reasons
>>> that the perf tool fails to open a counter. It just provides a
>>> misleading message for an event that never existed.
>>
>> The list of "default" events, not metrics, similarly has "<not
>> supported>" in many configurations with "-dd" or "-ddd" on AMD. I'm
>> not sure the set of default events, at different detail levels, is
>> necessarily the best. The default events can also be a source of
>> multiplexing, for example, showing branch miss rate alongside topdown
>> metrics. Anyway, for the "<not supported>" we should probably be able
>> to tweak should_skip_zero_counter that is in stat-display.c and tag
>> these default events as "skippable".
>
> The "skippable" should be fine as long as it's completely hidden.
>
> BTW: The stalled-cycles-backend should be similar to the
> stalled-cycles-frontend, but it isn't shown in the example. Is the
> stalled-cycles-backend event missed?
>
> Thanks,
> Kan

Sorry, I should have made it clearer that I truncated the output just to
focus on the <not supported> part. The full output is below and it does
include stalled-cycles-backend.

I'll have a look at hiding the ones that don't exist; I think it will
look cleaner. At the same time, what it says isn't incorrect, and we
don't hide the lines from cores where the process didn't run, so the
<not supported> lines don't look out of place next to the <not counted>
ones.

$ perf stat -- true

 Performance counter stats for 'true':

        0.42 msec  task-clock                         # 0.439 CPUs utilized
                0  context-switches                   # 0.000 /sec
                0  cpu-migrations                     # 0.000 /sec
               53  page-faults                        # 125.592 K/sec
          978,160  cpu_atom/instructions/             # 0.91 insn per cycle
    <not counted>  cpu_core/instructions/             (0.00%)
        1,070,525  cpu_atom/cycles/                   # 2.537 GHz
    <not counted>  cpu_core/cycles/                   (0.00%)
  <not supported>  cpu_atom/stalled-cycles-frontend/
  <not supported>  cpu_core/stalled-cycles-frontend/
  <not supported>  cpu_atom/stalled-cycles-backend/
  <not supported>  cpu_core/stalled-cycles-backend/
          175,814  cpu_atom/branches/                 # 416.620 M/sec
    <not counted>  cpu_core/branches/                 (0.00%)
            6,851  cpu_atom/branch-misses/            # 3.90% of all branches
    <not counted>  cpu_core/branch-misses/            (0.00%)
                   TopdownL1 (cpu_atom)               # 17.4 % tma_bad_speculation
                                                      # 21.8 % tma_retiring
                   TopdownL1 (cpu_atom)               # 27.5 % tma_backend_bound
                                                      # 33.3 % tma_frontend_bound

       0.000960792 seconds time elapsed

       0.000000000 seconds user
       0.000471000 seconds sys
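
To make the "skippable" idea from the thread concrete, here is a rough sketch of the display decision. It is not perf's actual code and does not mirror should_skip_zero_counter; SKIPPABLE, event_name, should_hide and visible are made-up names for illustration, and the rows are a subset of the output above. The assumption is that a line is hidden only when a default event is both tagged skippable and reported as <not supported>, while <not counted> lines stay visible:

```python
# Sketch only: models the display decision, not perf's implementation.
NOT_SUPPORTED = "<not supported>"
NOT_COUNTED = "<not counted>"

# Hypothetical set of default events tagged as "skippable".
SKIPPABLE = {"stalled-cycles-frontend", "stalled-cycles-backend"}


def event_name(pmu_event):
    """Return the bare event name, e.g. 'cycles' for 'cpu_atom/cycles/'."""
    return pmu_event.strip("/").split("/")[-1]


def should_hide(value, pmu_event):
    """Hide a line only if it is unsupported and its event is skippable."""
    return value == NOT_SUPPORTED and event_name(pmu_event) in SKIPPABLE


def visible(rows):
    """Keep every (value, event) row except skippable, unsupported ones."""
    return [(v, e) for v, e in rows if not should_hide(v, e)]


# A few rows taken from the output above.
rows = [
    ("978,160", "cpu_atom/instructions/"),
    (NOT_COUNTED, "cpu_core/instructions/"),
    (NOT_SUPPORTED, "cpu_atom/stalled-cycles-frontend/"),
    (NOT_SUPPORTED, "cpu_core/stalled-cycles-backend/"),
]

for value, event in visible(rows):
    print(f"{value:>16}  {event}")
```

With this rule the stalled-cycles lines disappear from the default output, and the <not counted> lines for the core where the process didn't run are still shown.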