linux-kernel - Re: [PATCH v3 1/3] perf report: Change sort order by a specified event in group

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fcedac74-7fb4-1c3b-67a3-9af24c256f40@linux.intel.com>
Date:   Tue, 17 Dec 2019 09:47:01 +0800
From:   "Jin, Yao" <yao.jin@...ux.intel.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     acme@...nel.org, jolsa@...nel.org, peterz@...radead.org,
        mingo@...hat.com, alexander.shishkin@...ux.intel.com,
        Linux-kernel@...r.kernel.org, ak@...ux.intel.com,
        kan.liang@...el.com, yao.jin@...el.com
Subject: Re: [PATCH v3 1/3] perf report: Change sort order by a specified
 event in group



On 12/16/2019 3:31 PM, Jiri Olsa wrote:
> On Thu, Dec 12, 2019 at 08:33:35PM +0800, Jin Yao wrote:
> 
> SNIP
> 
>>
>> Signed-off-by: Jin Yao <yao.jin@...ux.intel.com>
>> ---
>>   tools/perf/Documentation/perf-report.txt |   4 +
>>   tools/perf/builtin-report.c              |  10 +++
>>   tools/perf/ui/hist.c                     | 108 +++++++++++++++++++----
>>   tools/perf/util/symbol_conf.h            |   1 +
>>   4 files changed, 108 insertions(+), 15 deletions(-)
>>
>> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
>> index 8dbe2119686a..9ade613ef020 100644
>> --- a/tools/perf/Documentation/perf-report.txt
>> +++ b/tools/perf/Documentation/perf-report.txt
>> @@ -371,6 +371,10 @@ OPTIONS
>>   	Show event group information together. It forces group output also
>>   	if there are no groups defined in data file.
>>   
>> +--group-sort-idx::
>> +	Sort the output by the event at the index n in group. If n is invalid,
>> +	sort by the first event. WARNING: This should be used with --group.
> 
> --group in record or report?
> 

This --group is in perf-report. So even if it's not created with -e '{}' 
in perf-record, it still supports to show event information together.

> you can also create groups with -e '{}', not just --group option
> 
> I wonder you could check early on 'evlist->nr_groups' and fail
> if there's no group defined if the option is enabled
> 

Maybe we don't need to check that because it supports the case of no 
group defined.

For example,
perf record -e cycles,instructions
perf report --group --group-sort-idx 1 --stdio

#         Overhead  Command  Shared Object      Symbol
# ................  .......  ................. 
.............................
#
      0.00%  39.68%  swapper  [kernel.kallsyms]  [k] file_free_rcu
      0.00%  31.85%  swapper  [kernel.kallsyms]  [k] ___perf_sw_event
      3.96%   4.68%  swapper  [kernel.kallsyms]  [k] 
tick_nohz_idle_stop_tick
      0.00%   4.49%  perf     [kernel.kallsyms]  [k] update_load_avg
      0.00%   4.49%  swapper  [kernel.kallsyms]  [k] tcp_ack
      4.06%   3.75%  swapper  [kernel.kallsyms]  [k] update_rt_rq_load_avg
      3.54%   3.71%  swapper  [kernel.kallsyms]  [k] update_blocked_averages
      1.88%   2.06%  swapper  [kernel.kallsyms]  [k] load_balance

The output is sorted by the second event.

> also what happens when we have more groups?
>

Let me take an example.

perf record -e '{cycles,instructions}' -e '{branches,cache-misses}' -a

perf report --group  --group-sort-idx 1 --stdio

# Samples: 116  of events 'anon group { cycles, instructions }'
# Event count (approx.): 17552917
#
#         Overhead  Command          Shared Object      Symbol
# ................  ...............  ................. 
...............................
#
      0.00%  13.81%  perf             [kernel.kallsyms]  [k] number
      0.00%  12.50%  swapper          [kernel.kallsyms]  [k] cpuidle_select
      0.00%  12.02%  sleep            [kernel.kallsyms]  [k] 
filemap_map_pages
      0.00%  11.40%  perf             [kernel.kallsyms]  [k] rcu_all_qs
      0.00%  11.30%  perf             [kernel.kallsyms]  [k] alloc_set_pte
      0.00%  10.54%  swapper          [kernel.kallsyms]  [k] 
timekeeping_advance
      0.00%   9.89%  kworker/5:2-mm_  [kernel.kallsyms]  [k] 
update_rt_rq_load_avg
      0.00%   7.86%  rcu_sched        [kernel.kallsyms]  [k] 
finish_task_switch
......

# Samples: 100  of events 'anon group { branches, cache-misses }'
# Event count (approx.): 1214391
#
#         Overhead  Command          Shared Object            Symbol
# ................  ...............  ....................... 
...............................
#
      0.00%  23.13%  perf             [kernel.kallsyms]        [k] 
ext4_dirty_inode
      0.00%  18.66%  sleep            [kernel.kallsyms]        [k] 
free_pipe_info
      0.00%  10.74%  perf             [kernel.kallsyms]        [k] 
unmap_page_range
      0.00%   9.35%  sleep            [kernel.kallsyms]        [k] 
__wake_up_common_lock
      0.00%   7.71%  wpa_supplicant   libc-2.27.so             [.] 
cfree@...BC_2.2.5
      0.00%   6.55%  thermald         libglib-2.0.so.0.5600.4  [.] 
g_mutex_lock
      0.00%   3.43%  perf             [kernel.kallsyms]        [k] 
_raw_spin_lock
      0.00%   3.12%  migration/0      [kernel.kallsyms]        [k] 
kthread_should_stop
      0.00%   2.87%  migration/1      [kernel.kallsyms]        [k] 
__bitmap_and
      0.00%   2.37%  migration/3      [kernel.kallsyms]        [k] 
update_rt_rq_load_avg
      0.00%   2.01%  perf             [kernel.kallsyms]        [k] 
security_inode_permission
      0.00%   1.91%  migration/5      [kernel.kallsyms]        [k] 
update_sd_lb_stats
      0.00%   1.86%  perf             [kernel.kallsyms]        [k] 
__alloc_file
      0.00%   1.83%  swapper          [kernel.kallsyms]        [k] 
intel_idle
      0.00%   1.49%  migration/2      [kernel.kallsyms]        [k] 
__switch_to_asm
      0.00%   1.41%  migration/6      [kernel.kallsyms]        [k] 
__bitmap_and
......

The outputs are sorted by the second event.

Let me take one more example of multiple groups with different amount of 
events.

perf record -e '{cycles,instructions}' -e 
'{branches,cache-misses,cache-references}' -a

perf report --group --group-sort-idx 2 --stdio

# Samples: 135  of events 'anon group { cycles, instructions }'
# Event count (approx.): 20558030
#
#         Overhead  Command          Shared Object             Symbol
# ................  ...............  ........................ 
...................................
#
     21.52%   0.00%  perf             [kernel.kallsyms]         [k] 
anon_vma_interval_tree_remove
     10.63%   0.00%  perf             [kernel.kallsyms]         [k] strncpy
      9.61%   0.00%  migration/5      [kernel.kallsyms]         [k] 
put_prev_task_stop
      8.64%   0.00%  migration/3      [kernel.kallsyms]         [k] 
update_blocked_averages
      6.53%   0.00%  migration/2      [kernel.kallsyms]         [k] 
update_sd_lb_stats
      5.41%   0.00%  migration/0      [kernel.kallsyms]         [k] rb_erase
      5.32%   0.00%  migration/4      [kernel.kallsyms]         [k] 
reweight_entity
      4.90%   0.00%  swapper          [kernel.kallsyms]         [k] 
e1000_clean_rx_irq
      4.35%   0.00%  migration/1      [kernel.kallsyms]         [k] schedule
      3.94%   0.00%  migration/6      [kernel.kallsyms]         [k] 
propagate_entity_cfs_rq.isra.97
......

The output for group '{cycles, instructions}' is sorted by default 
(first event) since the group-sort-idx (2) is invalid.

# Samples: 187  of events 'anon group { branches, cache-misses, 
cache-references }'
# Event count (approx.): 2428023
#
#                 Overhead  Command          Shared Object 
Symbol
# ........................  ...............  ........................ 
.........................................
#
      0.00%   0.00%  21.54%  perf             [kernel.kallsyms] 
[k] up_write
      0.00%   0.00%  10.01%  swapper          [kernel.kallsyms] 
[k] note_gp_changes
      0.00%   0.00%   9.51%  kworker/u16:1-e  [kernel.kallsyms] 
[k] finish_task_switch
      0.00%  12.48%   8.06%  swapper          [kernel.kallsyms] 
[k] intel_idle
      0.00%   0.00%   4.08%  perf             [kernel.kallsyms] 
[k] kmem_cache_alloc
      0.00%   0.00%   3.83%  migration/3      [kernel.kallsyms] 
[k] update_blocked_averages
      0.00%   0.00%   3.58%  migration/5      [kernel.kallsyms] 
[k] pick_next_task_rt
      0.00%   0.00%   3.46%  swapper          [kernel.kallsyms] 
[k] rcu_dynticks_eqs_exit
      0.00%   0.00%   3.22%  swapper          [kernel.kallsyms] 
[k] calc_global_load
      0.00%   0.00%   3.13%  swapper          [kernel.kallsyms] 
[k] cpuidle_not_available
......

The output for group '{branches, cache-misses, cache-references}' is 
sorted by the third event. It's expected.

> I think the option is easy to use as it is, just needs to be explained
> consequences for more groups with different amount of events
> 

Thanks. Can we say something as following?

--group-sort-idx::
	Sort the output by the event at the index n in group. If n is invalid, 
sort by the first event. It can support multiple groups with different 
amount of events. WARNING: This should be used with perf report --group.

> SNIP
> 
>> +
>> +static int __hpp__group_sort_idx(struct hist_entry *a, struct hist_entry *b,
>> +				 hpp_field_fn get_field, int idx)
>> +{
>> +	struct evsel *evsel = hists_to_evsel(a->hists);
>> +	u64 *fields_a, *fields_b;
>> +	int cmp, nr_members, ret, i;
>> +
>> +	cmp = field_cmp(get_field(a), get_field(b));
>> +	if (!perf_evsel__is_group_event(evsel))
>> +		return cmp;
>> +
>> +	nr_members = evsel->core.nr_members;
>> +	ret = pair_fields_alloc(a, b, get_field, nr_members,
>> +				&fields_a, &fields_b);
>> +	if (ret) {
>> +		ret = cmp;
>> +		goto out;
>> +	}
>> +
>> +	if (idx >= 1 && idx < nr_members) {
> 
> do we need to continue here if idx is out of the limit?
> I'm not sure but mybe in that case the comparison above
> should be enough?
> 

Yes, we can use simpler code. Something like,

+       cmp = field_cmp(get_field(a), get_field(b));
+       if (!perf_evsel__is_group_event(evsel))
+               return cmp;
+
+       nr_members = evsel->core.nr_members;
+       if (idx < 1 || idx >= nr_members)
+               return cmp;
         ......

Thanks
Jin Yao

> thanks,
> jirka
>