[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 3 Mar 2016 09:21:30 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: linux-kernel@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
Andi Kleen <ak@...ux.intel.com>,
Brendan Gregg <brendan.d.gregg@...il.com>,
David Ahern <dsahern@...il.com>, He Kuang <hekuang@...wei.com>,
Jeff Bastian <jbastian@...hat.com>,
Jeremie Galarneau <jeremie.galarneau@...icios.com>,
Jiri Olsa <jolsa@...nel.org>,
Josh Boyer <jwboyer@...oraproject.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Li Zefan <lizefan@...wei.com>,
Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
Namhyung Kim <namhyung@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, pi3orama@....com,
Stephane Eranian <eranian@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>,
Taeung Song <treeze.taeung@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Wang Nan <wangnan0@...wei.com>,
Arnaldo Carvalho de Melo <acme@...hat.com>
Subject: Re: [GIT PULL 00/11] perf/core improvements and fixes
* Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 675965b00d734c985e4285f5bec7e524d15fc4e1:
>
> perf: Export perf_event_sysfs_show() (2016-02-29 09:35:27 +0100)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160229
>
> for you to fetch changes up to 575a02e00b11eecbbabcb1eb22eab4c68e91ae77:
>
> perf record: Ensure return non-zero rc when mmap fail (2016-02-29 12:44:15 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Check existence of frontend/backed stalled cycles in 'perf stat' (Andi Kleen)
>
> - Avoid installing .o files from tools/lib/ into the python extension (Jiri Olsa)
>
> - Rename the tracepoint '/format' field that carries the syscall ID from 'nr',
> that is also the name of some syscalls arguments, to "__syscall_nr", to
> avoid having multiple fields with the same name, that was breaking the
> python script skeleton generator from perf.data files (Taeung Song)
>
> - Support converting data from bpf events in 'perf data' (Wang Nan)
>
> Infrastructure:
>
> - Split libtraceevent's pevent_print_event() (Steven Rostedt)
>
> - Librarize some 'perf record' bits to allow handling multiple perf.data
> files per session (Wang Nan)
>
> - Ensure return non-zero rc when mmap fail in 'perf record' (Wang Nan)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
>
> ----------------------------------------------------------------
> Andi Kleen (1):
> perf stat: Check existence of frontend/backed stalled cycles
>
> Jiri Olsa (1):
> perf tools: Fix python extension build
>
> Steven Rostedt (1):
> tools lib traceevent: Split pevent_print_event() into specific functionality functions
>
> Taeung Song (2):
> perf trace: Check and discard not only 'nr' but also '__syscall_nr'
> tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:
>
> Wang Nan (6):
> perf data: Support converting data from bpf_perf_event_output()
> perf data: Explicitly set byte order for integer types
> perf record: Use WARN_ONCE to replace 'if' condition
> perf record: Extract synthesize code to record__synthesize()
> perf record: Introduce record__finish_output() to finish a perf.data
> perf record: Ensure return non-zero rc when mmap fail
>
> kernel/trace/trace_syscalls.c | 16 ++--
> tools/lib/traceevent/event-parse.c | 136 +++++++++++++++++++++++-------
> tools/lib/traceevent/event-parse.h | 13 +++
> tools/perf/builtin-record.c | 168 ++++++++++++++++++++++---------------
> tools/perf/builtin-stat.c | 22 ++++-
> tools/perf/builtin-trace.c | 8 +-
> tools/perf/util/data-convert-bt.c | 118 +++++++++++++++++++++++++-
> tools/perf/util/setup.py | 4 +
> 8 files changed, 372 insertions(+), 113 deletions(-)
Hm, there's a 'perf stat' regression that I can see:
Before:
triton:~/tip> perf stat -a sleep 1
Performance counter stats for 'system wide':
11990.023100 task-clock (msec) # 11.981 CPUs utilized
8,802 context-switches # 0.734 K/sec
543 cpu-migrations # 0.045 K/sec
97,375 page-faults # 0.008 M/sec
9,854,385,894 cycles # 0.822 GHz
15,274,841,152 stalled-cycles-frontend # 155.01% frontend cycles idle
<not supported> stalled-cycles-backend
9,634,486,137 instructions # 0.98 insn per cycle
# 1.59 stalled cycles per insn
1,818,488,088 branches # 151.667 M/sec
46,365,120 branch-misses # 2.55% of all branches
1.000741599 seconds time elapsed
After:
triton:~/tip> perf stat -a sleep 1
Performance counter stats for 'system wide':
11989.280397 task-clock (msec) # 11.981 CPUs utilized
1299 context-switches # 0.108 K/sec
6 cpu-migrations # 0.001 K/sec
70 page-faults # 0.006 K/sec
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle
# 2.34 stalled cycles per insn
24166678 branches # 2.016 M/sec
505681 branch-misses # 2.09% of all branches
1.000684278 seconds time elapsed
... see how the numbers became human-unreadable, losing the big-number separator?
I suspect it's due to the following commit:
fa184776ac27 perf stat: Check existence of frontend/backed stalled cycles
Thanks,
Ingo
Powered by blists - more mailing lists