linux-kernel - Re: [PATCH v2] perf stat: Introduce skippable evsels

Message-ID: <201a2ad6-3fb4-4b2a-d8a4-34d924e680c3@linux.intel.com>
Date:   Wed, 19 Apr 2023 10:16:33 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Ian Rogers <irogers@...gle.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Florian Fischer <florian.fischer@...q.space>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] perf stat: Introduce skippable evsels



On 2023-04-19 9:19 a.m., Ian Rogers wrote:
> On Wed, Apr 19, 2023 at 5:31 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>
>>
>>
>> On 2023-04-18 9:00 p.m., Ian Rogers wrote:
>>> On Tue, Apr 18, 2023 at 5:12 PM Ian Rogers <irogers@...gle.com> wrote:
>>>>
>>>> On Tue, Apr 18, 2023 at 2:51 PM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2023-04-18 4:08 p.m., Ian Rogers wrote:
>>>>>> On Tue, Apr 18, 2023 at 11:19 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 2023-04-18 11:43 a.m., Ian Rogers wrote:
>>>>>>>> On Tue, Apr 18, 2023 at 6:03 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2023-04-17 2:13 p.m., Ian Rogers wrote:
>>>>>>>>>> The json TopdownL1 is enabled if present unconditionally for perf stat
>>>>>>>>>> default. Enabling it on Skylake has multiplexing as TopdownL1 on
>>>>>>>>>> Skylake has multiplexing unrelated to this change - at least on the
>>>>>>>>>> machine I was testing on. We can remove the metric group TopdownL1 on
>>>>>>>>>> Skylake so that we don't enable it by default, there is still the
>>>>>>>>>> group TmaL1. To me, disabling TopdownL1 seems less desirable than
>>>>>>>>>> running with multiplexing - previously to get into topdown analysis
>>>>>>>>>> there has to be knowledge that "perf stat -M TopdownL1" is the way to
>>>>>>>>>> do this.
>>>>>>>>>
>>>>>>>>> To be honest, I don't think it's a good idea to remove the TopdownL1. We
>>>>>>>>> cannot remove it just because the new way cannot handle it. The perf
>>>>>>>>> stat default works well until 6.3-rc7. It's a regression issue of the
>>>>>>>>> current perf-tools-next.
>>>>>>>>
>>>>>>>> I'm not so clear it is a regression to consistently add TopdownL1 for
>>>>>>>> all architectures supporting it.
>>>>>>>
>>>>>>>
>>>>>>> Breaking the perf stat default is a regression.
>>>>>>
>>>>>> Breaking is overstating the use of multiplexing. The impact is less
>>>>>> accuracy in the IPC and branch misses default metrics,
>>>>>
>>>>> Inaccuracy is a breakage for the default.
>>>>
>>>> Can you present a case where this matters? The events are already not
>>>> grouped and so inaccurate for metrics.
>>>
>>> Removing CPUs without perf metrics from the TopdownL1 metric group is
>>> implemented here:
>>> https://lore.kernel.org/lkml/20230419005423.343862-6-irogers@google.com/
>>> Note, this applies to pre-Icelake and atom CPUs as these also lack
>>> perf metric (aka topdown) events.
>>>
>>
>> That may give the end user the impression that the pre-Icelake doesn't
>> support the Topdown Level1 events, which is not true.
>>
>> I think perf should either keep it for all Intel platforms which
>> supports tma_L1_group, or remove the TopdownL1 name entirely for Intel
>> platform (let the end user use the tma_L1_group and the name exposed by
>> the kernel as before.).
> 
> How does this work on hybrid systems? We will enable TopdownL1 because
> of the presence of perf metric (aka topdown) events but this will also
> enable TopdownL1 on the atom core.


This is the output from a hybrid system with the current 6.3-rc7.

As you can see, the Topdown L1 and L2 metrics are displayed for the big
core. No Topdown events are displayed for the atom core.

(BTW: The 99.15% is not multiplexing. I think it's because perf stat
may start on the big core, and it takes a little time before anything
runs on the small core.)
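
For readers unfamiliar with the trailing percentage: it is the share of the enabled time during which the event was actually counting (time running over time enabled), and perf scales the raw count by the inverse ratio when the two differ. A minimal sketch of that arithmetic follows. The function names and the nanosecond values are hypothetical and were picked only to land on 99.15%; they are not measurements from the run in this message.

```python
def running_percentage(time_enabled_ns: int, time_running_ns: int) -> float:
    """Share of the enabled time during which the event was counting."""
    if time_enabled_ns == 0:
        return 0.0
    return 100.0 * time_running_ns / time_enabled_ns


def scaled_count(raw: int, time_enabled_ns: int, time_running_ns: int) -> int:
    """Extrapolate a raw count to the full enabled window."""
    if time_running_ns == 0:
        return 0
    return round(raw * time_enabled_ns / time_running_ns)


if __name__ == "__main__":
    # Hypothetical timings (not taken from the run above).
    enabled_ns = 200_000_000
    running_ns = 198_300_000
    print(f"({running_percentage(enabled_ns, running_ns):.2f}%)")
    print(scaled_count(1_000_000, enabled_ns, running_ns))
```

A value below 100% therefore only says the event was not counting for part of the window. It does not by itself distinguish multiplexing from the task simply not having run on that PMU's CPUs yet.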


$ perf stat ./hybrid_triad_loop.sh

 Performance counter stats for './hybrid_triad_loop.sh':

            211.80 msec task-clock                       #    0.996 CPUs utilized
                 5      context-switches                 #   23.608 /sec
                 3      cpu-migrations                   #   14.165 /sec
               652      page-faults                      #    3.078 K/sec
       411,470,713      cpu_core/cycles/                 #    1.943 G/sec
       607,566,483      cpu_atom/cycles/                 #    2.869 G/sec                       (99.15%)
     1,613,379,362      cpu_core/instructions/           #    7.618 G/sec
     1,616,816,312      cpu_atom/instructions/           #    7.634 G/sec                       (99.15%)
       202,876,952      cpu_core/branches/               #  957.884 M/sec
       202,367,829      cpu_atom/branches/               #  955.480 M/sec                       (99.15%)
            56,740      cpu_core/branch-misses/          #  267.898 K/sec
            19,033      cpu_atom/branch-misses/          #   89.864 K/sec                       (99.15%)
     2,468,765,562      cpu_core/slots/                  #   11.656 G/sec
     1,411,184,398      cpu_core/topdown-retiring/       #     57.4% Retiring
         4,671,159      cpu_core/topdown-bad-spec/       #      0.2% Bad Speculation
        92,222,378      cpu_core/topdown-fe-bound/       #      3.7% Frontend Bound
       952,516,107      cpu_core/topdown-be-bound/       #     38.7% Backend Bound
         2,696,347      cpu_core/topdown-heavy-ops/      #      0.1% Heavy Operations          #     57.2% Light Operations
         4,460,659      cpu_core/topdown-br-mispredict/  #      0.2% Branch Mispredict         #      0.0% Machine Clears
        19,538,486      cpu_core/topdown-fetch-lat/      #      0.8% Fetch Latency             #      3.0% Fetch Bandwidth
        24,170,592      cpu_core/topdown-mem-bound/      #      1.0% Memory Bound              #     37.7% Core Bound

       0.212598999 seconds time elapsed

       0.212525000 seconds user
       0.000000000 seconds sys
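
The derived column in the output above can be cross-checked from the raw counts. The sketch below recomputes the per-second rate for cpu_core/cycles/ and the four Topdown L1 percentages. It assumes the rate is the count divided by task-clock, and that each L1 percentage is the event's count divided by the sum of the four L1 counts. Both are assumptions chosen because they reproduce the printed figures, not a statement of how perf computes them internally; the helper names are hypothetical.

```python
# Values copied from the perf stat output above.
TASK_CLOCK_MSEC = 211.80
CPU_CORE_CYCLES = 411_470_713

TOPDOWN_L1 = {
    "retiring": 1_411_184_398,
    "bad-spec": 4_671_159,
    "fe-bound": 92_222_378,
    "be-bound": 952_516_107,
}


def rate_per_sec(count: int, task_clock_msec: float = TASK_CLOCK_MSEC) -> float:
    """Events per second of task-clock time (assumed definition)."""
    return count / (task_clock_msec / 1000.0)


def topdown_l1_percentages(counts: dict) -> dict:
    """Each L1 count as a share of the sum of all L1 counts (assumed)."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    print(f"cpu_core/cycles/: {rate_per_sec(CPU_CORE_CYCLES) / 1e9:.3f} G/sec")
    for name, pct in topdown_l1_percentages(TOPDOWN_L1).items():
        print(f"topdown-{name}: {pct:.1f}%")
```

Under these assumptions the four percentages sum to 100% by construction, which is consistent with the 57.4 / 0.2 / 3.7 / 38.7 split shown for the big core.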


> 
>>
>>> With that change I don't have a case that requires skippable evsels,
>>> and so we can take that series with patch 6 over the v1 of that series
>>> with this change.
>>>
>>
>> I'm afraid this is not the only problem the commit 94b1a603fca7 ("perf
>> stat: Add TopdownL1 metric as a default if present") in the
>> perf-tools-next branch introduced.
>>
>> The topdown L2 in the perf stat default on SPR and big core of the ADL
>> is still missed. I don't see a possible fix for this on the current
>> perf-tools-next branch.
> 
> I thought in its current state the json metrics for TopdownL2 on SPR
> have multiplexing. Given L1 is used to drill down to L2, it seems odd
> to start on L2, but given L1 is used to compute the thresholds for L2,
> this should be to have both L1 and L2 on these platforms. However,
> that doesn't work as you don't want multiplexing.
> 
> This all seems backward to avoid potential multiplexing on branch miss
> rate and IPC, just always having TopdownL1 seems cleanest with the
> skippable evsels working around the permissions issue - as put forward
> in this patch. Possibly adding L2 metrics on ADL/SPR, but only once
> the multiplexing issue is resolved.
> 

No, not just that issue. Based on what I have tested over the past few
days, the perf stat default has issues/regressions on most Intel
platforms with the current perf-tools-next and perf/core branches of
acme's repo.

For pre-ICL platforms:
- The permission issue. (This patch tries to address it.)
- Unclean perf stat default output. (This patch fails to address it.)
  Unnecessary multiplexing for cycles.
  Only part of TopdownL1 is displayed.

https://lore.kernel.org/lkml/d1fe801a-22d0-1f9b-b127-227b21635bd5@linux.intel.com/

For SPR platforms:
- The Topdown L2 metrics are missing, while they work with the current 6.3-rc7.

For ADL/RPL platforms:
- Segmentation fault, which I just found this morning.
# ./perf stat true
Segmentation fault (core dumped)


After testing on a hybrid machine, I am inclined to revert commit
94b1a603fca7 ("perf stat: Add TopdownL1 metric as a default if present")
and the related patches for now.

To clarify, I do not object to a generic solution for Topdown across
different ARCHs. But the current generic solution, aka TopdownL1, has all
kinds of problems on most Intel platforms. We should fix them before
applying it to the mainline.

Thanks,
Kan