[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b6a49ed6-612f-5f53-a08c-c35bae52cfe3@linux.intel.com>
Date: Thu, 9 Jun 2022 18:41:19 +0800
From: Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To: Namhyung Kim <namhyung@...il.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, alexander.shishkin@...el.com,
Jiri Olsa <jolsa@...hat.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-perf-users <linux-perf-users@...r.kernel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>
Subject: Re: [PATCH 5/5] perf stat: Add topdown metrics in the default perf
stat on the hybrid machine
On 6/9/2022 8:09 AM, Namhyung Kim wrote:
> On Tue, Jun 7, 2022 at 1:08 AM <zhengjun.xing@...ux.intel.com> wrote:
>>
>> From: Zhengjun Xing <zhengjun.xing@...ux.intel.com>
>>
>> Topdown metrics are missed in the default perf stat on the hybrid machine,
>> add Topdown metrics in default perf stat for hybrid systems.
>>
>> Currently, we support the perf metrics Topdown for the p-core PMU in the
>> perf stat default, the perf metrics Topdown support for e-core PMU will be
>> implemented later separately. Refactor the code adds two x86 specific
>> functions. Widen the size of the event name column by 7 chars, so that all
>> metrics after the "#" become aligned again.
>>
>> The perf metrics topdown feature is supported on the cpu_core of ADL. The
>> dedicated perf metrics counter and the fixed counter 3 are used for the
>> topdown events. Adding the topdown metrics doesn't trigger multiplexing.
>>
>> Before:
>>
>> # ./perf stat -a true
>>
>> Performance counter stats for 'system wide':
>>
>> 53.70 msec cpu-clock # 25.736 CPUs utilized
>> 80 context-switches # 1.490 K/sec
>> 24 cpu-migrations # 446.951 /sec
>> 52 page-faults # 968.394 /sec
>> 2,788,555 cpu_core/cycles/ # 51.931 M/sec
>> 851,129 cpu_atom/cycles/ # 15.851 M/sec
>> 2,974,030 cpu_core/instructions/ # 55.385 M/sec
>> 416,919 cpu_atom/instructions/ # 7.764 M/sec
>> 586,136 cpu_core/branches/ # 10.916 M/sec
>> 79,872 cpu_atom/branches/ # 1.487 M/sec
>> 14,220 cpu_core/branch-misses/ # 264.819 K/sec
>> 7,691 cpu_atom/branch-misses/ # 143.229 K/sec
>>
>> 0.002086438 seconds time elapsed
>>
>> After:
>>
>> # ./perf stat -a true
>>
>> Performance counter stats for 'system wide':
>>
>> 61.39 msec cpu-clock # 24.874 CPUs utilized
>> 76 context-switches # 1.238 K/sec
>> 24 cpu-migrations # 390.968 /sec
>> 52 page-faults # 847.097 /sec
>> 2,753,695 cpu_core/cycles/ # 44.859 M/sec
>> 903,899 cpu_atom/cycles/ # 14.725 M/sec
>> 2,927,529 cpu_core/instructions/ # 47.690 M/sec
>> 428,498 cpu_atom/instructions/ # 6.980 M/sec
>> 581,299 cpu_core/branches/ # 9.470 M/sec
>> 83,409 cpu_atom/branches/ # 1.359 M/sec
>> 13,641 cpu_core/branch-misses/ # 222.216 K/sec
>> 8,008 cpu_atom/branch-misses/ # 130.453 K/sec
>> 14,761,308 cpu_core/slots/ # 240.466 M/sec
>> 3,288,625 cpu_core/topdown-retiring/ # 22.3% retiring
>> 1,323,323 cpu_core/topdown-bad-spec/ # 9.0% bad speculation
>> 5,477,470 cpu_core/topdown-fe-bound/ # 37.1% frontend bound
>> 4,679,199 cpu_core/topdown-be-bound/ # 31.7% backend bound
>> 646,194 cpu_core/topdown-heavy-ops/ # 4.4% heavy operations # 17.9% light operations
>> 1,244,999 cpu_core/topdown-br-mispredict/ # 8.4% branch mispredict # 0.5% machine clears
>> 3,891,800 cpu_core/topdown-fetch-lat/ # 26.4% fetch latency # 10.7% fetch bandwidth
>> 1,879,034 cpu_core/topdown-mem-bound/ # 12.7% memory bound # 19.0% Core bound
>>
>> 0.002467839 seconds time elapsed
>>
>> Signed-off-by: Zhengjun Xing <zhengjun.xing@...ux.intel.com>
>> Reviewed-by: Kan Liang <kan.liang@...ux.intel.com>
>> ---
> [SNIP]
>> +const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
>> +{
>> + const char *pmu_name = "cpu";
>> +
>> + if (perf_pmu__has_hybrid()) {
>> + if (!evlist->hybrid_pmu_name) {
>> + if (warn)
>> + pr_warning
>> + ("WARNING: default to use cpu_core topdown events\n");
>> + evlist->hybrid_pmu_name =
>> + perf_pmu__hybrid_type_to_pmu("core");
>
> This doesn't look good. Please consider reducing the
> indent level like returning early as
>
> if (!perf_pmu__has_hybrid())
> return "cpu";
>
> if (!evlist->hybrid_pmu_name) {
> ...
>
Thanks for the comments, I will update it in the next version.
> Thanks,
> Namhyung
>
>
>> + }
>> +
>> + pmu_name = evlist->hybrid_pmu_name;
>> + }
>> + return pmu_name;
>> +}
--
Zhengjun Xing
Powered by blists - more mailing lists