[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <709fd95a-d379-da99-3816-83797d3d5411@linux.intel.com>
Date: Tue, 16 Mar 2021 12:39:38 +0800
From: "Jin, Yao" <yao.jin@...ux.intel.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
mingo@...hat.com, alexander.shishkin@...ux.intel.com,
Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event
mixed group
Hi Jiri,
On 3/16/2021 7:04 AM, Jiri Olsa wrote:
> On Thu, Mar 11, 2021 at 03:07:32PM +0800, Jin Yao wrote:
>> A group mixed with hybrid event and global event is allowed. For example,
>> group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
>>
>> e.g.
>> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
>>
>> The challenge is their available cpus are not fully matched.
>> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_core/cycles/'
>> is available on CPU16-CPU23.
>>
>> When getting the group id for group member, we must be very careful
>> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
>> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
>> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
>>
>> Another challenge is for group read. The events in group may be not
>> available on all cpus. For example the leader is a software event and
>> it's available on CPU0-CPU1, but the group member is a hybrid event and
>> it's only available on CPU1. For CPU0, we have only one event, but for CPU1
>> we have two events. So we need to change the read size according to
>> the real number of events on that cpu.
>
> ugh, this is really bad.. do we really want to support it? ;-)
> I guess we need that for metrics..
>
Yes, it's a bit of pain but the user case makes sense. Some metrics need the event group which
consists of global event + hybrid event.
For example, CPU_Utilization = 'cpu_clk_unhalted.ref_tsc' / 'msr/tsc/'.
'msr/tsc/' is a global event. It's valid on all CPUs.
But 'cpu_clk_unhalted.ref' is hybrid event.
'cpu_core/cpu_clk_unhalted.ref/' is valid on core CPUs
'cpu_atom/cpu_clk_unhalted.ref/' is valid on atom CPUs.
So we have to support this usage. :)
> SNIP
>
>>
>> Performance counter stats for 'system wide':
>>
>> 24,059.14 msec cpu-clock # 23.994 CPUs utilized
>> 6,406,677,892 cpu_atom/cycles/ # 266.289 M/sec
>>
>> 1.002699058 seconds time elapsed
>>
>> For cpu_atom/cycles/, cpu16-cpu23 are set with valid group fd (cpu-clock's fd
>> on that cpu). For counting results, cpu-clock has 24 cpus aggregation and
>> cpu_atom/cycles/ has 8 cpus aggregation. That's expected.
>>
>> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
>> there leaves more works to do.
>>
>> root@...-pwrt-002:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
>
> what id you add the other hybrid pmu event? or just cycles?
>
Do you mean the config for cpu_atom/cycles/? Let's see the log.
root@...-pwrt-002:~# perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
type 1
size 120
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid -1 cpu 16 group_fd 3 flags 0x8 = 28
sys_perf_event_open: pid -1 cpu 17 group_fd 4 flags 0x8 = 29
sys_perf_event_open: pid -1 cpu 18 group_fd 5 flags 0x8 = 30
sys_perf_event_open: pid -1 cpu 19 group_fd 7 flags 0x8 = 31
sys_perf_event_open: pid -1 cpu 20 group_fd 8 flags 0x8 = 32
sys_perf_event_open: pid -1 cpu 21 group_fd 9 flags 0x8 = 33
sys_perf_event_open: pid -1 cpu 22 group_fd 10 flags 0x8 = 34
sys_perf_event_open: pid -1 cpu 23 group_fd 11 flags 0x8 = 35
cycles: 0: 800791792 1002389889 1002389889
cycles: 1: 800788198 1002383611 1002383611
cycles: 2: 800783491 1002377507 1002377507
cycles: 3: 800777752 1002371035 1002371035
cycles: 4: 800771559 1002363669 1002363669
cycles: 5: 800766391 1002356944 1002356944
cycles: 6: 800761593 1002350144 1002350144
cycles: 7: 800756258 1002343203 1002343203
WARNING: for cpu-clock, some CPU counts not read
cpu-clock: 0: 0 0 0
cpu-clock: 1: 0 0 0
cpu-clock: 2: 0 0 0
cpu-clock: 3: 0 0 0
cpu-clock: 4: 0 0 0
cpu-clock: 5: 0 0 0
cpu-clock: 6: 0 0 0
cpu-clock: 7: 0 0 0
cpu-clock: 8: 0 0 0
cpu-clock: 9: 0 0 0
cpu-clock: 10: 0 0 0
cpu-clock: 11: 0 0 0
cpu-clock: 12: 0 0 0
cpu-clock: 13: 0 0 0
cpu-clock: 14: 0 0 0
cpu-clock: 15: 0 0 0
cpu-clock: 16: 1002390566 1002389889 1002389889
cpu-clock: 17: 1002383263 1002383611 1002383611
cpu-clock: 18: 1002377257 1002377507 1002377507
cpu-clock: 19: 1002370895 1002371035 1002371035
cpu-clock: 20: 1002363611 1002363669 1002363669
cpu-clock: 21: 1002356623 1002356944 1002356944
cpu-clock: 22: 1002349562 1002350144 1002350144
cpu-clock: 23: 1002343089 1002343203 1002343203
cycles: 6406197034 8018936002 8018936002
cpu-clock: 8018934866 8018936002 8018936002
Performance counter stats for 'system wide':
6,406,197,034 cpu_atom/cycles/ # 798.884 M/sec
8,018.93 msec cpu-clock # 7.999 CPUs utilized
1.002475994 seconds time elapsed
'0xa00000000' is the config for cpu_atom/cycles/ and we can see this event is only enabled on CPU16-23.
'cpu-clock' is a global event and it's enable don CPU0-23 (but only have valid group fd on CPU16-23).
>
> SNIP
>
>> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
>> +{
>> + struct evsel *pos;
>> + int nr = 1, back, new_size = 0, idx;
>> +
>> + for_each_group_member(pos, leader) {
>> + idx = evsel_cpuid_match(leader, pos, cpu);
>> + if (idx != -1)
>> + nr++;
>> + }
>> +
>> + if (nr != leader->core.nr_members) {
>> + back = leader->core.nr_members;
>> + leader->core.nr_members = nr;
>> + new_size = perf_evsel__read_size(&leader->core);
>> + leader->core.nr_members = back;
>> + }
>> +
>> + *nr_members = nr;
>> + return new_size;
>> +}
>> +
>> static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>> {
>> struct perf_stat_evsel *ps = leader->stats;
>> u64 read_format = leader->core.attr.read_format;
>> int size = perf_evsel__read_size(&leader->core);
>> + int new_size, nr_members;
>> u64 *data = ps->group_data;
>>
>> if (!(read_format & PERF_FORMAT_ID))
>> return -EINVAL;
>
> I wonder if we do not find some reasonable generic way to process
> this, porhaps we should make some early check that this evlist has
> hybrid event and the move the implementation in some separated
> hybrid-XXX object, so we don't confuse the code
>
> jirka
>
Agree. The code looks a bit tricky and hard to understand for non-hybrid platform.
Maybe it's a good idea to move this code to hybrid-xxx files.
Thanks
Jin Yao
Powered by blists - more mailing lists