linux-kernel - Re: [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event mixed group


Message-ID: <YE/oF4pOkyQmm0rI@krava>
Date:   Tue, 16 Mar 2021 00:04:55 +0100
From:   Jiri Olsa <jolsa@...hat.com>
To:     Jin Yao <yao.jin@...ux.intel.com>
Cc:     acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
        mingo@...hat.com, alexander.shishkin@...ux.intel.com,
        Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v2 17/27] perf evsel: Adjust hybrid event and global
 event mixed group

On Thu, Mar 11, 2021 at 03:07:32PM +0800, Jin Yao wrote:
> A group that mixes a hybrid event and a global event is allowed. For example,
> the group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
> 
> e.g.
> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
> 
> The challenge is that their available cpus do not fully match.
> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
> is available only on CPU16-CPU23.
> 
> When getting the group id for group member, we must be very careful
> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
> 
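
The index-versus-CPU-ID distinction described above can be sketched as follows. This is only an illustration under stated assumptions: `struct cpu_map_sketch` and every `sketch_*` name are hypothetical stand-ins, not the perf tool's types or helpers, and the two maps simply reuse the CPU ranges quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-event CPU map; not the perf tool's type. */
struct cpu_map_sketch {
	int nr;          /* number of entries in cpus[] */
	const int *cpus; /* real CPU IDs, one per index */
};

/* Example maps built from the ranges quoted above (assumed for illustration). */
const int sketch_clock_cpus[] = {
	 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,
	12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
};
const int sketch_atom_cpus[] = { 16, 17, 18, 19, 20, 21, 22, 23 };

const struct cpu_map_sketch sketch_clock_map = { 24, sketch_clock_cpus };
const struct cpu_map_sketch sketch_atom_map = { 8, sketch_atom_cpus };

/* Return the real CPU ID stored at position idx, or -1 if idx is out of range. */
int sketch_idx_to_cpu(const struct cpu_map_sketch *map, int idx)
{
	assert(map != NULL);
	if (idx < 0 || idx >= map->nr)
		return -1;
	return map->cpus[idx];
}

/* Return the index in map that holds the real CPU ID cpu, or -1 if absent. */
int sketch_cpu_to_idx(const struct cpu_map_sketch *map, int cpu)
{
	assert(map != NULL);
	for (int i = 0; i < map->nr; i++) {
		if (map->cpus[i] == cpu)
			return i;
	}
	return -1;
}

/*
 * Translate an index in the member's map into the index in the leader's map
 * that refers to the same real CPU. Returns -1 if the leader has no entry
 * for that CPU.
 */
int sketch_member_to_leader_idx(const struct cpu_map_sketch *leader,
				const struct cpu_map_sketch *member,
				int member_idx)
{
	int cpu = sketch_idx_to_cpu(member, member_idx);

	if (cpu < 0)
		return -1;
	return sketch_cpu_to_idx(leader, cpu);
}
```

With these maps, index 0 of the 'cpu_atom/cycles/' map resolves to CPU16, which sits at index 16 in the 'cpu-clock' map, so the group fd has to be looked up at the leader's index 16 rather than 0.
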
> Another challenge is the group read. The events in a group may not be
> available on all cpus. For example, the leader is a software event that
> is available on CPU0-CPU1, but the group member is a hybrid event that
> is only available on CPU1. For CPU0 we have only one event, but for CPU1
> we have two events. So we need to change the read size according to
> the real number of events on that cpu.
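
To make the read-size point concrete, here is a rough sketch of how the size of a group read buffer depends on the number of events present on a CPU. It follows the group read layout documented in perf_event_open(2): an optional `nr` field, optional time fields, then one value (plus optional id) per event. The `SKETCH_FORMAT_*` constants are local stand-ins mirroring the `PERF_FORMAT_*` bits, and `struct event_span_sketch` and the `sketch_*` functions are hypothetical names, not the perf tool's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the PERF_FORMAT_* read_format bits. */
#define SKETCH_FORMAT_TOTAL_TIME_ENABLED (1ULL << 0)
#define SKETCH_FORMAT_TOTAL_TIME_RUNNING (1ULL << 1)
#define SKETCH_FORMAT_ID                 (1ULL << 2)
#define SKETCH_FORMAT_GROUP              (1ULL << 3)

/* Hypothetical description of one event: the inclusive CPU range it covers. */
struct event_span_sketch {
	int first_cpu;
	int last_cpu;
};

/* Example from the text: leader on CPU0-CPU1, member only on CPU1. */
const struct event_span_sketch sketch_group[] = {
	{ 0, 1 }, /* software leader */
	{ 1, 1 }, /* hybrid member */
};

/* Count how many events of the group are available on the given CPU. */
int sketch_events_on_cpu(const struct event_span_sketch *events,
			 int nr_events, int cpu)
{
	int nr = 0;

	for (int i = 0; i < nr_events; i++) {
		if (cpu >= events[i].first_cpu && cpu <= events[i].last_cpu)
			nr++;
	}
	return nr;
}

/* Bytes a read returns for nr_events events with this read_format. */
size_t sketch_group_read_size(uint64_t read_format, int nr_events)
{
	size_t entry = sizeof(uint64_t); /* counter value */
	size_t size = 0;
	size_t nr = 1;

	assert(nr_events >= 1);

	if (read_format & SKETCH_FORMAT_TOTAL_TIME_ENABLED)
		size += sizeof(uint64_t);
	if (read_format & SKETCH_FORMAT_TOTAL_TIME_RUNNING)
		size += sizeof(uint64_t);
	if (read_format & SKETCH_FORMAT_ID)
		entry += sizeof(uint64_t); /* id stored next to each value */
	if (read_format & SKETCH_FORMAT_GROUP) {
		nr = (size_t)nr_events;
		size += sizeof(uint64_t); /* leading nr field */
	}
	return size + entry * nr;
}
```

With only the group and id bits set, the CPU0 read (one event) would be 24 bytes and the CPU1 read (two events) 40 bytes, which is why a single size derived from the leader's member count does not fit every cpu.
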

ugh, this is really bad.. do we really want to support it? ;-)
I guess we need that for metrics..

SNIP

> 
>    Performance counter stats for 'system wide':
> 
>            24,059.14 msec cpu-clock                 #   23.994 CPUs utilized
>        6,406,677,892      cpu_atom/cycles/          #  266.289 M/sec
> 
>          1.002699058 seconds time elapsed
> 
> For cpu_atom/cycles/, cpu16-cpu23 are set with valid group fd (cpu-clock's fd
> on that cpu). For counting results, cpu-clock has 24 cpus aggregation and
> cpu_atom/cycles/ has 8 cpus aggregation. That's expected.
> 
> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
> there is more work to do.
> 
>   root@...-pwrt-002:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1

what if you add the other hybrid pmu event? or just cycles?


SNIP
  
> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
> +{
> +	struct evsel *pos;
> +	int nr = 1, back, new_size = 0, idx;
> +
> +	for_each_group_member(pos, leader) {
> +		idx = evsel_cpuid_match(leader, pos, cpu);
> +		if (idx != -1)
> +			nr++;
> +	}
> +
> +	if (nr != leader->core.nr_members) {
> +		back = leader->core.nr_members;
> +		leader->core.nr_members = nr;
> +		new_size = perf_evsel__read_size(&leader->core);
> +		leader->core.nr_members = back;
> +	}
> +
> +	*nr_members = nr;
> +	return new_size;
> +}
> +
>  static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>  {
>  	struct perf_stat_evsel *ps = leader->stats;
>  	u64 read_format = leader->core.attr.read_format;
>  	int size = perf_evsel__read_size(&leader->core);
> +	int new_size, nr_members;
>  	u64 *data = ps->group_data;
>  
>  	if (!(read_format & PERF_FORMAT_ID))
>  		return -EINVAL;

I wonder, if we do not find some reasonable generic way to process
this, perhaps we should make some early check that this evlist has
a hybrid event and then move the implementation into some separate
hybrid-XXX object, so we don't confuse the code

jirka