Message-ID: <4d39856e-396d-4a48-9ca3-2e1a574f50d7@linux.intel.com>
Date: Tue, 9 Jul 2024 12:17:54 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Kan Liang <kan.liang@...ux.intel.com>, linux-perf-users@...r.kernel.org,
 linux-kernel@...r.kernel.org, Yongwei Ma <yongwei.ma@...el.com>,
 Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [Patch v2 3/5] perf x86/topdown: Don't move topdown metrics
 events when sorting events


On 7/8/2024 11:08 PM, Ian Rogers wrote:
> On Mon, Jul 8, 2024 at 12:40 AM Dapeng Mi <dapeng1.mi@...ux.intel.com> wrote:
>> When running the perf command below, an error is reported.
>>
>> perf record -e "{slots,instructions,topdown-retiring}:S" -vv -C0 sleep 1
>>
>> ------------------------------------------------------------
>> perf_event_attr:
>>   type                             4 (cpu)
>>   size                             168
>>   config                           0x400 (slots)
>>   sample_type                      IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER
>>   read_format                      ID|GROUP|LOST
>>   disabled                         1
>>   sample_id_all                    1
>>   exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
>> ------------------------------------------------------------
>> perf_event_attr:
>>   type                             4 (cpu)
>>   size                             168
>>   config                           0x8000 (topdown-retiring)
>>   { sample_period, sample_freq }   4000
>>   sample_type                      IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER
>>   read_format                      ID|GROUP|LOST
>>   freq                             1
>>   sample_id_all                    1
>>   exclude_guest                    1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
>> sys_perf_event_open failed, error -22
>>
>> Error:
>> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-retiring).
>>
>> The error occurs because the events are regrouped and the
>> topdown-retiring event is moved to immediately after the slots event.
>> The topdown-retiring event then has to do the sampling, but the Intel
>> PMU driver doesn't support sampling on topdown metrics events.
>>
>> A topdown metrics event only needs to be in a group that has the slots
>> event as leader. It does not need to come immediately after the slots
>> event. Moving topdown metrics events to immediately after the slots
>> event during regrouping is therefore overkill, and it causes the issue
>> above.
>>
>> Thus delete the code that moves topdown metrics events to fix the
>> issue.
> I think this is wrong. The topdown events may not be in a group, such
> cases can come from metrics due to grouping constraints, and so they
> must be sorted together so that they may be gathered into a group to
> avoid the perf event opens failing for ungrouped topdown events. I'm
> not understanding what these patches are trying to do, if you want to
> prioritize the event for leader sampling why not modify it to compare

Per my understanding, this change doesn't break anything. The event
regrouping can be divided into the several cases below.

a. all events in a group

perf stat -e "{instructions,topdown-retiring,slots}" -C0 sleep 1
WARNING: events were regrouped to match PMUs

 Performance counter stats for 'CPU(s) 0':

        15,066,240      slots
         1,899,760      instructions
         2,126,998      topdown-retiring

       1.045783464 seconds time elapsed

In this case, the slots event is moved to be the leader event and all
events stay in the same group.

b. all events not in a group

perf stat -e "instructions,topdown-retiring,slots" -C0 sleep 1
WARNING: events were regrouped to match PMUs

 Performance counter stats for 'CPU(s) 0':

         2,045,561      instructions
        17,108,370      slots
         2,281,116      topdown-retiring

       1.045639284 seconds time elapsed

In this case, slots and topdown-retiring are placed into a group with slots
as the group leader. The instructions event is outside the group.

c. slots event in group but topdown metric events outside the group

perf stat -e "{instructions,slots},topdown-retiring"  -C0 sleep 1
WARNING: events were regrouped to match PMUs

 Performance counter stats for 'CPU(s) 0':

        20,323,878      slots
         2,634,884      instructions
         3,028,656      topdown-retiring

       1.045076380 seconds time elapsed

In this case, the topdown-retiring event is placed into the preceding group
and slots is made the leader event.

d. multiple event groups

perf stat -e "{instructions,slots},{topdown-retiring}"  -C0 sleep 1
WARNING: events were regrouped to match PMUs

 Performance counter stats for 'CPU(s) 0':

        26,319,024      slots
         2,427,791      instructions
         2,683,508      topdown-retiring

       1.045495830 seconds time elapsed

In this case, the two groups are merged into one group and the slots event
is made the leader.

The key point of this patch is that it's unnecessary to move topdown
metrics events to immediately after the slots event. It's overkill, since
the Intel core PMU driver doesn't require that. The Intel PMU driver only
requires that topdown metrics events be in a group where the slots event is
the group leader. Worse, moving the topdown metrics events causes the issue
described in the commit message.
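
To make the constraint concrete, here is a minimal sketch in C. It is not
the kernel or perf code: struct toy_event, toy_is_metric() and
toy_group_ok() are hypothetical names used only for illustration, and
"a name starting with topdown-" is an assumed stand-in for the real
topdown metrics check. It only models the rule stated above: a topdown
metrics event is acceptable wherever it sits in the group, as long as the
group leader is slots.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of an event; not perf's struct evsel. */
struct toy_event {
	const char *name;
	size_t leader;	/* array index of the group leader; own index if ungrouped */
};

/* Assumption for this sketch: topdown metrics events are named "topdown-*". */
static bool toy_is_metric(const char *name)
{
	return strncmp(name, "topdown-", 8) == 0;
}

/*
 * Returns true if every topdown metrics event has slots as its group
 * leader. The position of the metrics event inside the group is not
 * checked, because the rule modelled here does not depend on it.
 */
static bool toy_group_ok(const struct toy_event *evs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!toy_is_metric(evs[i].name))
			continue;
		if (strcmp(evs[evs[i].leader].name, "slots") != 0)
			return false;
	}
	return true;
}
```

With this model, {slots,instructions,topdown-retiring} passes even though
topdown-retiring is not adjacent to slots, while an ungrouped
topdown-retiring or a group led by instructions does not.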

This patch doesn't prevent topdown metrics events from being regrouped. It
just removes the unnecessary movement of topdown metrics events.
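
The effect on ordering can be sketched with a simplified comparator in C.
This is a toy, not arch_evlist__cmp() itself: struct toy_evsel, toy_cmp(),
toy_sort() and the toy_move_metrics switch are hypothetical, and events are
classified by name only as an assumption. With the switch off (the
behaviour after this patch), only slots is moved to the front and the
other events keep their insertion order. With the switch on (mimicking the
removed lines), topdown metrics events jump ahead of ordinary events.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an event; only what the sketch needs. */
struct toy_evsel {
	const char *name;
	int idx;	/* insertion index on the command line */
};

static bool toy_is_slots(const struct toy_evsel *e)
{
	return strcmp(e->name, "slots") == 0;
}

/* Assumption for this sketch: topdown metrics events are named "topdown-*". */
static bool toy_is_topdown_metric(const struct toy_evsel *e)
{
	return strncmp(e->name, "topdown-", 8) == 0;
}

/* true: mimic the removed lines; false: behaviour after the patch. */
static bool toy_move_metrics;

static int toy_cmp(const void *l, const void *r)
{
	const struct toy_evsel *lhs = l, *rhs = r;

	/* The slots event always sorts first so it can become the leader. */
	if (toy_is_slots(lhs) != toy_is_slots(rhs))
		return toy_is_slots(lhs) ? -1 : 1;

	/* Old behaviour only: topdown metrics follow slots directly. */
	if (toy_move_metrics &&
	    toy_is_topdown_metric(lhs) != toy_is_topdown_metric(rhs))
		return toy_is_topdown_metric(lhs) ? -1 : 1;

	/* Default ordering by insertion index. */
	return lhs->idx - rhs->idx;
}

static void toy_sort(struct toy_evsel *evs, size_t n, bool move_metrics)
{
	toy_move_metrics = move_metrics;
	qsort(evs, n, sizeof(*evs), toy_cmp);
}
```

For {slots,instructions,topdown-retiring}, the toy keeps instructions in
second position when the switch is off, and puts topdown-retiring second
when it is on, which is the ordering that led to the failing open in the
commit message.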


> first?
>
> Thanks,
> Ian
>
>> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>> ---
>>  tools/perf/arch/x86/util/evlist.c | 5 -----
>>  1 file changed, 5 deletions(-)
>>
>> diff --git a/tools/perf/arch/x86/util/evlist.c b/tools/perf/arch/x86/util/evlist.c
>> index 332e8907f43e..6046981d61cf 100644
>> --- a/tools/perf/arch/x86/util/evlist.c
>> +++ b/tools/perf/arch/x86/util/evlist.c
>> @@ -82,11 +82,6 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs)
>>                         return -1;
>>                 if (arch_is_topdown_slots(rhs))
>>                         return 1;
>> -               /* Followed by topdown events. */
>> -               if (arch_is_topdown_metrics(lhs) && !arch_is_topdown_metrics(rhs))
>> -                       return -1;
>> -               if (!arch_is_topdown_metrics(lhs) && arch_is_topdown_metrics(rhs))
>> -                       return 1;
>>         }
>>
>>         /* Default ordering by insertion index. */
>> --
>> 2.40.1
>>
