linux-kernel - Re: [PATCH v4 00/44] Fix perf on Intel hybrid CPUs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUc7mcEurwCBxZ_Dx=1pARynyra5Psi3J_OVk9JCXYMcA@mail.gmail.com>
Date:   Mon, 15 May 2023 15:49:45 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     "Liang, Kan" <kan.liang@...ux.intel.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ahmad Yasin <ahmad.yasin@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Stephane Eranian <eranian@...gle.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Perry Taylor <perry.taylor@...el.com>,
        Samantha Alt <samantha.alt@...el.com>,
        Caleb Biggers <caleb.biggers@...el.com>,
        Weilin Wang <weilin.wang@...el.com>,
        Edward Baker <edward.baker@...el.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Florian Fischer <florian.fischer@...q.space>,
        Rob Herring <robh@...nel.org>,
        John Garry <john.g.garry@...cle.com>,
        Kajol Jain <kjain@...ux.ibm.com>,
        Sumanth Korikkar <sumanthk@...ux.ibm.com>,
        Thomas Richter <tmricht@...ux.ibm.com>,
        Tiezhu Yang <yangtiezhu@...ngson.cn>,
        Ravi Bangoria <ravi.bangoria@....com>,
        Leo Yan <leo.yan@...aro.org>,
        Yang Jihong <yangjihong1@...wei.com>,
        James Clark <james.clark@....com>,
        Suzuki Poulouse <suzuki.poulose@....com>,
        Kang Minchul <tegongkang@...il.com>,
        Athira Rajeev <atrajeev@...ux.vnet.ibm.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 00/44] Fix perf on Intel hybrid CPUs

On Sun, May 14, 2023 at 5:03 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>
>
>
> On 2023-05-12 2:33 p.m., Arnaldo Carvalho de Melo wrote:
> > Em Wed, May 03, 2023 at 04:56:36PM -0400, Liang, Kan escreveu:
> >>
> >>
> >> On 2023-05-02 6:38 p.m., Ian Rogers wrote:
> >>> TL;DR: hybrid doesn't crash, json metrics work on hybrid on both PMUs
> >>> or individually, event parsing doesn't always scan all PMUs, more and
> >>> new tests that also run without hybrid, less code.
> >>>
> >>> The first 4 patches are aimed at Linux 6.4 to address issues raised,
> >>> in particular by Kan, on the existing perf stat behavior with json
> >>> metrics. They avoid duplicated events by removing groups. They don't
> >>> hide events and metrics to make event multiplexing obvious. They avoid
> >>> terminating perf when paranoia is higher due to certain events that
> >>> always fail. They avoid rearranging events by PMUs when the events
> >>> aren't in a group.
> >>>
> >>> The next 5 patches avoid grouping events for metrics where they could
> >>> never succeed and were previously posted as:
> >>> "perf vendor events intel: Add xxx metric constraints"
> >>> https://lore.kernel.org/all/20230419005423.343862-1-irogers@google.com/
> >>> In general the generated json is coming from:
> >>> https://github.com/intel/perfmon/pull/73
> >>>
> >>> Next are some general and test improvements.
> >>>
> >>> Next event parsing is rewritten to not scan all PMUs for the benefit
> >>> of raw and legacy cache parsing, instead these are handled by the
> >>> lexer and a new term type. This ultimately removes the need for the
> >>> event parser for hybrid to be recursive as legacy cache can be just a
> >>> term. Tests are re-enabled for events with hyphens, so AMD's
> >>> branch-brs event is now parsable.
> >>>
> >>> The cputype option is made a generic pmu filter flag and is tested
> >>> even on non-hybrid systems.
> >>>
> >>> The final patches address specific json metric issues on hybrid, in
> >>> both the json metrics and the metric code.
> >>>
> >>> The patches add slightly more code than they remove, in areas like
> >>> better json metric constraints and tests, but in the core util code,
> >>> the removal of hybrid is a net reduction:
> >>>  22 files changed, 711 insertions(+), 1016 deletions(-)
> >>>
> >>> Sample output is contained in the v1 patch set:
> >>> https://lore.kernel.org/lkml/bff481ba-e60a-763f-0aa0-3ee53302c480@linux.intel.com/
> >>>
> >>> Tested on Tigerlake, Skylake and Alderlake CPUs.
> >>>
> >>> The v4 patch set:
> >>>  - rebase, 1 of the Linux 6.4 recommended patches are merged leaving:
> >>>    1) perf metric: Change divide by zero and !support events behavior
> >>>    2) perf stat: Introduce skippable evsels
> >>>    3) perf metric: Json flag to not group events if gathering a metric group
> >>>    4) perf parse-events: Don't reorder ungrouped events by pmu
> >>>    whose diffstat is:
> >>>     30 files changed, 326 insertions(+), 33 deletions(-)
> >>>    but without the vendor event updates (the tend to be large as they
> >>>    repeat something per architecture per metric) is just:
> >>>     10 files changed, 90 insertions(+), 32 deletions(-)
> >>
> >> I have tested the 4 patches on top of the perf-tools-next branch on both
> >> Cascade Lake and Raptor Lake. The result looks good to me.
> >>
> >> They address the permission error found in the default mode of perf stat
> >> on the Cascade Lake. Thanks Ian for the fix.
> >>
> >> Arnaldo, could you please consider to back port them for the 6.4?
> >
> > Yes, its in perf-tools now, will go to Linus next week.
>
> Thanks Arnaldo!
>
> >
> > What about the other patches? I saw some you provided your review, what
> > about the others, are you ok with them?
> >
>
> Yes, I'm OK with the patch set. It fixes many issues. Thanks Ian.
> (My tests mainly focus on the area in which the patch set may touch. I
> did the tests on various platforms, ADL (hybrid), Cascade Lake, SPR.)
>
> Tested-by: Kan Liang <kan.liang@...ux.intel.com>
>
> But there are still some issues. I don't think they are introduced by
> this patch set. We may fix them later separately.
>
> - Segmentation fault with perf stat --topdown on ADL (hybrid) and
> Cascade Lake.
>   It looks like a legacy issue, may not be introduced by this patch set.
>   Here is the backtrace. It looks like there is a NULL metric_group.
>
> (gdb) backtrace
> #0  0x00007ffff73035d1 in __strstr_sse2_unaligned () from /lib64/libc.so.6
> #1  0x00000000004f9019 in metricgroup__topdown_max_level_callback
> (pm=<optimized out>, table=<optimized out>,
>     data=0x7fffffff92f4) at util/metricgroup.c:1722
> #2  0x00000000005e8a31 in pmu_metrics_table_for_each_metric
> (table=0xcb74d0 <pmu_events_map+368>,
>     fn=fn@...ry=0x4f8ff0 <metricgroup__topdown_max_level_callback>,
> data=data@...ry=0x7fffffff92f4)
>     at pmu-events/pmu-events.c:61123
> #3  0x00000000004fbc3b in metricgroups__topdown_max_level () at
> util/metricgroup.c:1742
> #4  0x000000000042c135 in add_default_attributes () at builtin-stat.c:1845
> #5  cmd_stat (argc=0, argv=0x7fffffffe3e0) at builtin-stat.c:2446
> #6  0x00000000004b922b in run_builtin (p=p@...ry=0xd5c530
> <commands+336>, argc=argc@...ry=2,
>     argv=argv@...ry=0x7fffffffe3e0) at perf.c:323
> #7  0x000000000040e373 in handle_internal_command (argv=0x7fffffffe3e0,
> argc=2) at perf.c:377
> #8  run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at
> perf.c:421
> #9  main (argc=2, argv=0x7fffffffe3e0) at perf.c:537
> (gdb)

Thanks, there's a 1-liner for the segv:
https://lore.kernel.org/lkml/20230515224530.671331-1-irogers@google.com/
Arnaldo, can we make sure this goes toward the next 6.4 release candidate?

Thanks,
Ian

> Also, the return type is unsigned int, but a bool is given.
>
> unsigned int metricgroups__topdown_max_level(void)
> {
>         unsigned int max_level = 0;
>         const struct pmu_metrics_table *table = pmu_metrics_table__find();
>
>         if (!table)
>                 return false;
>
>
>
> - The perf metric and metricgroups fail on different platforms.
>   Ian and I have discussed it. We agree to address it later separately.
>
>   102: perf all metricgroups test
>   ADL (hybrid)
>   103: perf all metrics test
>   ADL (hybrid), Cascade Lake, SPR
>
> - perf list: The [Kernel PMU event] is missed for all the hardware cache
> events.
>   It impacts both hybrid and non-hybrid platforms.
>   It's a user-visible change introduced by the patch set.
>   I don't know if anyone cares whether it's a kernel event or a regular
> event. It doesn't bother me. So I'm OK with it.
>
> cpu:
>   L1-dcache-loads OR cpu/L1-dcache-loads/
>   L1-dcache-load-misses OR cpu/L1-dcache-load-misses/
>   L1-dcache-stores OR cpu/L1-dcache-stores/
>   L1-icache-load-misses OR cpu/L1-icache-load-misses/
>   LLC-loads OR cpu/LLC-loads/
>   LLC-load-misses OR cpu/LLC-load-misses/
>   LLC-stores OR cpu/LLC-stores/
>   LLC-store-misses OR cpu/LLC-store-misses/
>   dTLB-loads OR cpu/dTLB-loads/
>   dTLB-load-misses OR cpu/dTLB-load-misses/
>   dTLB-stores OR cpu/dTLB-stores/
>   dTLB-store-misses OR cpu/dTLB-store-misses/
>   iTLB-load-misses OR cpu/iTLB-load-misses/
>   branch-loads OR cpu/branch-loads/
>   branch-load-misses OR cpu/branch-load-misses/
>   node-loads OR cpu/node-loads/
>   node-load-misses OR cpu/node-load-misses/
>   branch-instructions OR cpu/branch-instructions/    [Kernel PMU event]
>   branch-misses OR cpu/branch-misses/                [Kernel PMU event]
>   bus-cycles OR cpu/bus-cycles/                      [Kernel PMU event]
>   cache-misses OR cpu/cache-misses/                  [Kernel PMU event]
>   cache-references OR cpu/cache-references/          [Kernel PMU event]
>   cpu-cycles OR cpu/cpu-cycles/                      [Kernel PMU event]
>   instructions OR cpu/instructions/                  [Kernel PMU event]
>
>
> - The --cputype only works for the metric in the default mode.
>   I can still see the cpu_atom events with --cputype core
>   It may be something we can improve later.
>
> # ./perf stat --cputype core sleep 2
>
>  Performance counter stats for 'sleep 2':
>
>               0.52 msec task-clock                       #    0.000 CPUs
> utilized
>                  1      context-switches                 #    1.939 K/sec
>                  0      cpu-migrations                   #    0.000 /sec
>                 69      page-faults                      #  133.770 K/sec
>          2,569,423      cpu_core/cycles/                 #    4.981 G/sec
>      <not counted>      cpu_atom/cycles/
>                        (0.00%)
>          3,287,691      cpu_core/instructions/           #    6.374 G/sec
>      <not counted>      cpu_atom/instructions/
>                        (0.00%)
>            555,848      cpu_core/branches/               #    1.078 G/sec
>      <not counted>      cpu_atom/branches/
>                        (0.00%)
>              8,398      cpu_core/branch-misses/          #   16.281 M/sec
>      <not counted>      cpu_atom/branch-misses/
>                        (0.00%)
>         15,416,538      cpu_core/TOPDOWN.SLOTS/          #     36.1 %
> tma_backend_bound
>                                                   #     23.9 %  tma_retiring
>                                                   #      5.6 %
> tma_bad_speculation
>                                                   #     34.4 %
> tma_frontend_bound
>          3,687,877      cpu_core/topdown-retiring/
>            846,398      cpu_core/topdown-bad-spec/
>          5,320,217      cpu_core/topdown-fe-bound/
>          5,562,045      cpu_core/topdown-be-bound/
>             14,149      cpu_core/INT_MISC.UOP_DROPPING/  #   27.431 M/sec
>
>
> Thanks,
> Kan