lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 9 Jun 2022 18:41:19 +0800
From:   Xing Zhengjun <zhengjun.xing@...ux.intel.com>
To:     Namhyung Kim <namhyung@...il.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, alexander.shishkin@...el.com,
        Jiri Olsa <jolsa@...hat.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-perf-users <linux-perf-users@...r.kernel.org>,
        Ian Rogers <irogers@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>
Subject: Re: [PATCH 5/5] perf stat: Add topdown metrics in the default perf
 stat on the hybrid machine



On 6/9/2022 8:09 AM, Namhyung Kim wrote:
> On Tue, Jun 7, 2022 at 1:08 AM <zhengjun.xing@...ux.intel.com> wrote:
>>
>> From: Zhengjun Xing <zhengjun.xing@...ux.intel.com>
>>
>> Topdown metrics are missed in the default perf stat on the hybrid machine,
>> add Topdown metrics in default perf stat for hybrid systems.
>>
>> Currently, we support the perf metrics Topdown for the p-core PMU in the
>> perf stat default, the perf metrics Topdown support for e-core PMU will be
>> implemented later separately. Refactor the code adds two x86 specific
>> functions. Widen the size of the event name column by 7 chars, so that all
>> metrics after the "#" become aligned again.
>>
>> The perf metrics topdown feature is supported on the cpu_core of ADL. The
>> dedicated perf metrics counter and the fixed counter 3 are used for the
>> topdown events. Adding the topdown metrics doesn't trigger multiplexing.
>>
>> Before:
>>
>>   # ./perf  stat  -a true
>>
>>   Performance counter stats for 'system wide':
>>
>>               53.70 msec cpu-clock                 #   25.736 CPUs utilized
>>                  80      context-switches          #    1.490 K/sec
>>                  24      cpu-migrations            #  446.951 /sec
>>                  52      page-faults               #  968.394 /sec
>>           2,788,555      cpu_core/cycles/          #   51.931 M/sec
>>             851,129      cpu_atom/cycles/          #   15.851 M/sec
>>           2,974,030      cpu_core/instructions/    #   55.385 M/sec
>>             416,919      cpu_atom/instructions/    #    7.764 M/sec
>>             586,136      cpu_core/branches/        #   10.916 M/sec
>>              79,872      cpu_atom/branches/        #    1.487 M/sec
>>              14,220      cpu_core/branch-misses/   #  264.819 K/sec
>>               7,691      cpu_atom/branch-misses/   #  143.229 K/sec
>>
>>         0.002086438 seconds time elapsed
>>
>> After:
>>
>>   # ./perf stat  -a true
>>
>>   Performance counter stats for 'system wide':
>>
>>               61.39 msec cpu-clock                        #   24.874 CPUs utilized
>>                  76      context-switches                 #    1.238 K/sec
>>                  24      cpu-migrations                   #  390.968 /sec
>>                  52      page-faults                      #  847.097 /sec
>>           2,753,695      cpu_core/cycles/                 #   44.859 M/sec
>>             903,899      cpu_atom/cycles/                 #   14.725 M/sec
>>           2,927,529      cpu_core/instructions/           #   47.690 M/sec
>>             428,498      cpu_atom/instructions/           #    6.980 M/sec
>>             581,299      cpu_core/branches/               #    9.470 M/sec
>>              83,409      cpu_atom/branches/               #    1.359 M/sec
>>              13,641      cpu_core/branch-misses/          #  222.216 K/sec
>>               8,008      cpu_atom/branch-misses/          #  130.453 K/sec
>>          14,761,308      cpu_core/slots/                  #  240.466 M/sec
>>           3,288,625      cpu_core/topdown-retiring/       #     22.3% retiring
>>           1,323,323      cpu_core/topdown-bad-spec/       #      9.0% bad speculation
>>           5,477,470      cpu_core/topdown-fe-bound/       #     37.1% frontend bound
>>           4,679,199      cpu_core/topdown-be-bound/       #     31.7% backend bound
>>             646,194      cpu_core/topdown-heavy-ops/      #      4.4% heavy operations       #     17.9% light operations
>>           1,244,999      cpu_core/topdown-br-mispredict/  #      8.4% branch mispredict      #      0.5% machine clears
>>           3,891,800      cpu_core/topdown-fetch-lat/      #     26.4% fetch latency          #     10.7% fetch bandwidth
>>           1,879,034      cpu_core/topdown-mem-bound/      #     12.7% memory bound           #     19.0% Core bound
>>
>>         0.002467839 seconds time elapsed
>>
>> Signed-off-by: Zhengjun Xing <zhengjun.xing@...ux.intel.com>
>> Reviewed-by: Kan Liang <kan.liang@...ux.intel.com>
>> ---
> [SNIP]
>> +const char *arch_get_topdown_pmu_name(struct evlist *evlist, bool warn)
>> +{
>> +       const char *pmu_name = "cpu";
>> +
>> +       if (perf_pmu__has_hybrid()) {
>> +               if (!evlist->hybrid_pmu_name) {
>> +                       if (warn)
>> +                               pr_warning
>> +                                   ("WARNING: default to use cpu_core topdown events\n");
>> +                       evlist->hybrid_pmu_name =
>> +                           perf_pmu__hybrid_type_to_pmu("core");
> 
> This doesn't look good.  Please consider reducing the
> indent level like returning early as
> 
>      if (!perf_pmu__has_hybrid())
>          return "cpu";
> 
>      if (!evlist->hybrid_pmu_name) {
>          ...
> 
Thanks for the comments, I will update it in the next version.
> Thanks,
> Namhyung
> 
> 
>> +               }
>> +
>> +               pmu_name = evlist->hybrid_pmu_name;
>> +       }
>> +       return pmu_name;
>> +}

-- 
Zhengjun Xing

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ