lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 21 Dec 2015 23:26:43 +0900
From:	Namhyung Kim <namhyung@...nel.org>
To:	Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Jiri Olsa <jolsa@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	David Ahern <dsahern@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Andi Kleen <andi@...stfloor.org>,
	Wang Nan <wangnan0@...wei.com>
Subject: [PATCHSET 00/10] perf tools: Support dynamic sort keys for tracepoints (v3)

Hello,

This is an attempt to improve perf to deal with tracepoint events
better.  The perf tools can handle tracepoint events but perf report
on them is less useful since they're always sampled in a fixed
location and not provide event specific info.  We can use perf script
but I always wishes there's more convenient way to see the result.

 * changes in v3)
  - save trace_output for dynamic entries  (Jiri)
  - update field length for each entry

 * changes in v2)
  - add 'trace' sort key and make it default  (Jiri)
  - add '--raw-trace' option and '/raw' field modifier  (Jiri)
  - support event name shortcuts  (David)


I suggest dynamic sort keys created for each event.field on demand.
Consider following example:

  # Overhead  Command          Shared Object     Symbol
  # ........  ...............  ................  ..............
  #
      47.22%  swapper          [kernel.vmlinux]  [k] __schedule
      21.67%  transmission-gt  [kernel.vmlinux]  [k] __schedule
       8.23%  netctl-auto      [kernel.vmlinux]  [k] __schedule
       5.53%  kworker/0:1H     [kernel.vmlinux]  [k] __schedule
       1.98%  Xephyr           [kernel.vmlinux]  [k] __schedule
       1.33%  irq/33-iwlwifi   [kernel.vmlinux]  [k] __schedule
       1.17%  wpa_cli          [kernel.vmlinux]  [k] __schedule
       1.13%  rcu_preempt      [kernel.vmlinux]  [k] __schedule
       0.85%  ksoftirqd/0      [kernel.vmlinux]  [k] __schedule
       0.77%  Timer            [kernel.vmlinux]  [k] __schedule
  ...

Currently perf report only shows this but important info is on the
event fields, that is:

  # sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
  name: sched_switch
  ID: 268
  format:
    field:unsigned short common_type;         offset:0; size:2; signed:0;
    field:unsigned char common_flags;         offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid;                     offset:4; size:4; signed:1;

    field:char prev_comm[16]; offset:8;  size:16; signed:1;
    field:pid_t prev_pid;     offset:24; size:4;  signed:1;
    field:int prev_prio;      offset:28; size:4;  signed:1;
    field:long prev_state;    offset:32; size:8;  signed:1;
    field:char next_comm[16]; offset:40; size:16; signed:1;
    field:pid_t next_pid;     offset:56; size:4;  signed:1;
    field:int next_prio;      offset:60; size:4;  signed:1;

  print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==>
              next_comm=%s next_pid=%d next_prio=%d",
    REC->prev_comm, REC->prev_pid, REC->prev_prio,
    REC->prev_state & (2048-1) ? __print_flags(REC->prev_state & (2048-1),
    "|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" },
    { 64, "x" }, { 128, "K"}, { 256, "W" }, { 512, "P" }, { 1024, "N" }) : "R",
    REC->prev_state & 2048 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio

With dynamic sort keys, you can use <event.field> as a sort key.  Those
dynamic keys are checked and created on demand.  For instance, below is
to sort by next_pid field on the same data file.

  $ perf report -s comm,sched:sched_switch.next_pid --stdio
  ...
  # Overhead  Command            next_pid
  # ........  ...............  ..........
  #
      21.23%  transmission-gt           0
      20.86%  swapper               17773
       6.62%  netctl-auto               0
       5.25%  swapper                 109
       5.21%  kworker/0:1H              0
       1.98%  Xephyr                    0
       1.98%  swapper                6524
       1.98%  swapper               27478
       1.37%  swapper               27476
       1.17%  swapper                 233
  ...

Multiple dynamic sort keys are also supported and the event name can
be simplified (or even omitted in a couple of ways (see patch 10/10):

  $ perf report -s comm,switch.next_pid,next_comm --stdio
  ...
  # Overhead  Command            next_pid         next_comm
  # ........  ...............  ..........  ................
  #
      20.86%  swapper               17773   transmission-gt
       9.64%  transmission-gt           0         swapper/0
       9.16%  transmission-gt           0         swapper/2
       5.25%  swapper                 109      kworker/0:1H
       5.21%  kworker/0:1H              0         swapper/0
       2.14%  netctl-auto               0         swapper/2
       1.98%  netctl-auto               0         swapper/0
       1.98%  swapper                6524            Xephyr
       1.98%  swapper               27478       netctl-auto
       1.78%  transmission-gt           0         swapper/3
       1.53%  Xephyr                    0         swapper/0
       1.29%  netctl-auto               0         swapper/1
       1.29%  swapper               27476       netctl-auto
       1.21%  netctl-auto               0         swapper/3
       1.17%  swapper                 233    irq/33-iwlwifi
  ...

Note that pid 0 exists for each cpu so have comm of 'swapper/N'.

Also now it add a new 'trace' sort key to print whole trace output.
(The 'trace' sort key is the default key for tracepoint events now).
Below is the output using sched_switch plugin.

  $ perf report -s trace --stdio
  ...
  # Overhead  Trace output                                                 
  # ........  ...................................................
  #
       9.48%  swapper/0:0 [120] R ==> transmission-gt:17773 [120]          
       9.48%  transmission-gt:17773 [120] S ==> swapper/0:0 [120]          
       9.04%  swapper/2:0 [120] R ==> transmission-gt:17773 [120]          
       8.92%  transmission-gt:17773 [120] S ==> swapper/2:0 [120]          
       5.25%  swapper/0:0 [120] R ==> kworker/0:1H:109 [100]               
       5.21%  kworker/0:1H:109 [100] S ==> swapper/0:0 [120]               
       1.78%  swapper/3:0 [120] R ==> transmission-gt:17773 [120]          
       1.78%  transmission-gt:17773 [120] S ==> swapper/3:0 [120]          
       1.53%  Xephyr:6524 [120] S ==> swapper/0:0 [120]                    
       1.53%  swapper/0:0 [120] R ==> Xephyr:6524 [120]                    
       1.17%  swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49]              
       1.13%  irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120]             
  ...

The --raw-trace option is to control display of raw field info

  $ perf report -s trace --stdio --raw-trace
  ...
  # Overhead  Trace output                                                                                   
  # ........  ...........................................................................................................
  #
       9.48%   prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=0 next_comm=transmission-gt next_pid=17773 next_pr
       9.48%   prev_comm=transmission-gt prev_pid=17773 prev_prio=120 prev_state=1 next_comm=swapper/0 next_pid=0 next_pr
       9.04%   prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=0 next_comm=transmission-gt next_pid=17773 next_pr
       8.92%   prev_comm=transmission-gt prev_pid=17773 prev_prio=120 prev_state=1 next_comm=swapper/2 next_pid=0 next_pr
       5.25%   prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=0 next_comm=kworker/0:1H next_pid=109 next_prio=10
       5.21%   prev_comm=kworker/0:1H prev_pid=109 prev_prio=100 prev_state=1 next_comm=swapper/0 next_pid=0 next_prio=12
       1.78%   prev_comm=swapper/3 prev_pid=0 prev_prio=120 prev_state=0 next_comm=transmission-gt next_pid=17773 next_pr
       1.78%   prev_comm=transmission-gt prev_pid=17773 prev_prio=120 prev_state=1 next_comm=swapper/3 next_pid=0 next_pr
       1.53%   prev_comm=Xephyr prev_pid=6524 prev_prio=120 prev_state=1 next_comm=swapper/0 next_pid=0 next_prio=120    
       1.53%   prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=0 next_comm=Xephyr next_pid=6524 next_prio=120    
       1.17%   prev_comm=swapper/2 prev_pid=0 prev_prio=120 prev_state=0 next_comm=irq/33-iwlwifi next_pid=233 next_prio=
       1.13%   prev_comm=irq/33-iwlwifi prev_pid=233 prev_prio=49 prev_state=1 next_comm=swapper/2 next_pid=0 next_prio=1
  ...

This is available on 'perf/dynamic-sort-v3' branch in my tree

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks!
Namhyung


Namhyung Kim (10):
  perf hist: Pass struct sample to __hists__add_entry()
  perf hist: Save raw_data/size for tracepoint events
  tools lib traceevent: Factor out and export print_event_field[s]
  perf tools: Pass evlist to setup_sorting()
  perf tools: Add dynamic sort key for tracepoint events
  perf tools: Try to show pretty printed output for dynamic sort keys
  perf tools: Add 'trace' sort key
  perf tools: Add --raw-trace option
  perf tools: Make 'trace' sort key default for tracepoint events
  perf tools: Support shortcuts for events in dynamic sort keys

 tools/lib/traceevent/event-parse.c       | 125 ++++----
 tools/lib/traceevent/event-parse.h       |   4 +
 tools/perf/Documentation/perf-report.txt |   3 +
 tools/perf/Documentation/perf-top.txt    |   3 +
 tools/perf/builtin-annotate.c            |   9 +-
 tools/perf/builtin-diff.c                |  13 +-
 tools/perf/builtin-report.c              |   4 +-
 tools/perf/builtin-top.c                 |  16 +-
 tools/perf/tests/hists_cumulate.c        |   8 +-
 tools/perf/tests/hists_filter.c          |   2 +-
 tools/perf/tests/hists_link.c            |   8 +-
 tools/perf/tests/hists_output.c          |  10 +-
 tools/perf/util/hist.c                   |  36 ++-
 tools/perf/util/hist.h                   |   5 +-
 tools/perf/util/sort.c                   | 501 ++++++++++++++++++++++++++++++-
 tools/perf/util/sort.h                   |   9 +-
 tools/perf/util/symbol.h                 |   3 +-
 17 files changed, 638 insertions(+), 121 deletions(-)

-- 
2.6.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ