linux-kernel - Re: [PATCH v3 03/46] perf stat: Introduce skippable evsels

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUtgEvgburjhE6HpazDh9dtn=DiSOwHaVnoE3N7sBynEw@mail.gmail.com>
Date:   Mon, 1 May 2023 08:29:55 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     "Liang, Kan" <kan.liang@...ux.intel.com>,
        Weilin Wang <weilin.wang@...el.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ahmad Yasin <ahmad.yasin@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Stephane Eranian <eranian@...gle.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Perry Taylor <perry.taylor@...el.com>,
        Samantha Alt <samantha.alt@...el.com>,
        Caleb Biggers <caleb.biggers@...el.com>,
        Edward Baker <edward.baker@...el.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Florian Fischer <florian.fischer@...q.space>,
        Rob Herring <robh@...nel.org>,
        Zhengjun Xing <zhengjun.xing@...ux.intel.com>,
        John Garry <john.g.garry@...cle.com>,
        Kajol Jain <kjain@...ux.ibm.com>,
        Sumanth Korikkar <sumanthk@...ux.ibm.com>,
        Thomas Richter <tmricht@...ux.ibm.com>,
        Tiezhu Yang <yangtiezhu@...ngson.cn>,
        Ravi Bangoria <ravi.bangoria@....com>,
        Leo Yan <leo.yan@...aro.org>,
        Yang Jihong <yangjihong1@...wei.com>,
        James Clark <james.clark@....com>,
        Suzuki Poulouse <suzuki.poulose@....com>,
        Kang Minchul <tegongkang@...il.com>,
        Athira Rajeev <atrajeev@...ux.vnet.ibm.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 03/46] perf stat: Introduce skippable evsels

On Mon, May 1, 2023 at 7:56 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>
>
>
> On 2023-04-29 1:34 a.m., Ian Rogers wrote:
> > Perf stat with no arguments will use default events and metrics. These
> > events may fail to open even with kernel and hypervisor disabled. When
> > these fail then the permissions error appears even though they were
> > implicitly selected. This is particularly a problem with the automatic
> > selection of the TopdownL1 metric group on certain architectures like
> > Skylake:
> >
> > '''
> > $ perf stat true
> > Error:
> > Access to performance monitoring and observability operations is limited.
> > Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
> > access to performance monitoring and observability operations for processes
> > without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
> > More information can be found at 'Perf events and tool security' document:
> > https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> > perf_event_paranoid setting is 2:
> >   -1: Allow use of (almost) all events by all users
> >       Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
> >> = 0: Disallow raw and ftrace function tracepoint access
> >> = 1: Disallow CPU event access
> >> = 2: Disallow kernel profiling
> > To make the adjusted perf_event_paranoid setting permanent preserve it
> > in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
> > '''
> >
> > This patch adds skippable evsels that when they fail to open won't
> > cause termination and will appear as "<not supported>" in output. The
> > TopdownL1 events, from the metric group, are marked as skippable. This
> > turns the failure above to:
> >
> > '''
> > $ perf stat perf bench internals synthesize
> > Computing performance of single threaded perf event synthesis by
> > synthesizing events on the perf process itself:
> >   Average synthesis took: 49.287 usec (+- 0.083 usec)
> >   Average num. events: 3.000 (+- 0.000)
> >   Average time per event 16.429 usec
> >   Average data synthesis took: 49.641 usec (+- 0.085 usec)
> >   Average num. events: 11.000 (+- 0.000)
> >   Average time per event 4.513 usec
> >
> >  Performance counter stats for 'perf bench internals synthesize':
> >
> >           1,222.38 msec task-clock:u                     #    0.993 CPUs utilized
> >                  0      context-switches:u               #    0.000 /sec
> >                  0      cpu-migrations:u                 #    0.000 /sec
> >                162      page-faults:u                    #  132.529 /sec
> >        774,445,184      cycles:u                         #    0.634 GHz                         (49.61%)
> >      1,640,969,811      instructions:u                   #    2.12  insn per cycle              (59.67%)
> >        302,052,148      branches:u                       #  247.102 M/sec                       (59.69%)
> >          1,807,718      branch-misses:u                  #    0.60% of all branches             (59.68%)
> >          5,218,927      CPU_CLK_UNHALTED.REF_XCLK:u      #    4.269 M/sec
> >                                                   #     17.3 %  tma_frontend_bound
> >                                                   #     56.4 %  tma_retiring
> >                                                   #      nan %  tma_backend_bound
> >                                                   #      nan %  tma_bad_speculation      (60.01%)
> >        536,580,469      IDQ_UOPS_NOT_DELIVERED.CORE:u    #  438.965 M/sec                       (60.33%)
> >    <not supported>      INT_MISC.RECOVERY_CYCLES_ANY:u
> >          5,223,936      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u #    4.274 M/sec                       (40.31%)
> >        774,127,250      CPU_CLK_UNHALTED.THREAD:u        #  633.297 M/sec                       (50.34%)
> >      1,746,579,518      UOPS_RETIRED.RETIRE_SLOTS:u      #    1.429 G/sec                       (50.12%)
> >      1,940,625,702      UOPS_ISSUED.ANY:u                #    1.588 G/sec                       (49.70%)
> >
> >        1.231055525 seconds time elapsed
> >
> >        0.258327000 seconds user
> >        0.965749000 seconds sys
>
>
> Which branch is this patch series based on?
>
> I still cannot get the same output as the examples.
>
> I'm using the latest perf-tools-next (The latest commit ID is
> 5d27a645f609 ("perf tracepoint: Fix memory leak in is_valid_tracepoint()")).
> I only applied patch 2 and patch 3, since the patch 1 is already merged.
>
> It's a single socket Cascade Lake. with kernel 5.19-8.
> $ uname -r
> 5.19.8-100.fc35.x86_64
>
> As you can see, all the topdown related events are displayed twice.
>
> With root permission,
>
> $ sudo ./perf stat perf bench internals synthesize
> # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
>   Average synthesis took: 91.487 usec (+- 0.050 usec)
>   Average num. events: 47.000 (+- 0.000)
>   Average time per event 1.947 usec
>   Average data synthesis took: 97.720 usec (+- 0.059 usec)
>   Average num. events: 245.000 (+- 0.000)
>   Average time per event 0.399 usec
>
>  Performance counter stats for 'perf bench internals synthesize':
>
>           2,077.81 msec task-clock                       #    0.998 CPUs
> utilized
>                466      context-switches                 #  224.274 /sec
>                  4      cpu-migrations                   #    1.925 /sec
>                775      page-faults                      #  372.988 /sec
>      9,561,957,326      cycles                           #    4.602 GHz
>                        (31.17%)
>     24,466,854,021      instructions                     #    2.56  insn
> per cycle              (37.42%)
>      5,547,892,196      branches                         #    2.670
> G/sec                       (37.48%)
>         37,880,526      branch-misses                    #    0.68% of
> all branches             (37.52%)
>         49,576,109      CPU_CLK_UNHALTED.REF_XCLK        #   23.860 M/sec
>                                                   #     59.9 %  tma_retiring
>                                                   #      4.6 %
> tma_bad_speculation      (37.47%)
>        228,406,003      INT_MISC.RECOVERY_CYCLES_ANY     #  109.926
> M/sec                       (37.52%)
>         49,591,815      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE #   23.867
> M/sec                       (24.99%)
>      9,553,472,893      CPU_CLK_UNHALTED.THREAD          #    4.598
> G/sec                       (31.25%)
>     22,893,372,651      UOPS_RETIRED.RETIRE_SLOTS        #   11.018
> G/sec                       (31.23%)
>     24,180,375,299      UOPS_ISSUED.ANY                  #   11.637
> G/sec                       (31.25%)
>         49,562,300      CPU_CLK_UNHALTED.REF_XCLK        #   23.853 M/sec
>                                                   #     28.1 %
> tma_frontend_bound
>                                                   #      7.2 %
> tma_backend_bound        (31.24%)
>     10,735,205,084      IDQ_UOPS_NOT_DELIVERED.CORE      #    5.167
> G/sec                       (31.30%)
>        228,798,426      INT_MISC.RECOVERY_CYCLES_ANY     #  110.115
> M/sec                       (25.04%)
>         49,559,962      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE #   23.852
> M/sec                       (25.00%)
>      9,538,354,333      CPU_CLK_UNHALTED.THREAD          #    4.591
> G/sec                       (31.29%)
>     24,207,967,071      UOPS_ISSUED.ANY                  #   11.651
> G/sec                       (31.24%)
>
>        2.082670856 seconds time elapsed
>
>        0.812763000 seconds user
>        1.252387000 seconds sys

The events are displayed twice as there are 2 groups of events. This
is changed by:
https://lore.kernel.org/lkml/20230429053506.1962559-5-irogers@google.com/
where the events are no longer grouped.

> With non-root, nothing is counted for the topdownL1 events.
>
> $ ./perf stat perf bench internals synthesize
> # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
>   Average synthesis took: 91.852 usec (+- 0.139 usec)
>   Average num. events: 47.000 (+- 0.000)
>   Average time per event 1.954 usec
>   Average data synthesis took: 96.230 usec (+- 0.046 usec)
>   Average num. events: 245.000 (+- 0.000)
>   Average time per event 0.393 usec
>
>  Performance counter stats for 'perf bench internals synthesize':
>
>           2,051.95 msec task-clock:u                     #    0.997 CPUs
> utilized
>                  0      context-switches:u               #    0.000 /sec
>                  0      cpu-migrations:u                 #    0.000 /sec
>                765      page-faults:u                    #  372.816 /sec
>      3,601,662,523      cycles:u                         #    1.755 GHz
>                        (16.72%)
>      9,241,811,003      instructions:u                   #    2.57  insn
> per cycle              (33.43%)
>      2,238,848,485      branches:u                       #    1.091
> G/sec                       (50.06%)
>         19,966,181      branch-misses:u                  #    0.89% of
> all branches             (66.77%)
>      <not counted>      CPU_CLK_UNHALTED.REF_XCLK:u
>    <not supported>      INT_MISC.RECOVERY_CYCLES_ANY:u
>      <not counted>      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u
>      <not counted>      CPU_CLK_UNHALTED.THREAD:u
>      <not counted>      UOPS_RETIRED.RETIRE_SLOTS:u
>      <not counted>      UOPS_ISSUED.ANY:u
>      <not counted>      CPU_CLK_UNHALTED.REF_XCLK:u
>      <not counted>      IDQ_UOPS_NOT_DELIVERED.CORE:u
>    <not supported>      INT_MISC.RECOVERY_CYCLES_ANY:u
>      <not counted>      CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE:u
>      <not counted>      CPU_CLK_UNHALTED.THREAD:u
>      <not counted>      UOPS_ISSUED.ANY:u
>
>        2.057691297 seconds time elapsed
>
>        0.766640000 seconds user
>        1.275170000 seconds sys

The reason nothing is counted is that all the metrics are trying to
share groups with the event INT_MISC.RECOVERY_CYCLES_ANY in them,
which means the whole group ends up being not supported. Again, the
patch above that removes the groups for TopdownL1 and TopdownL2, as
requested by Weilin, will address the issue.

Thanks,
Ian

> Thanks,
> Kan
>