Date:   Tue, 11 Oct 2016 14:30:55 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org, Linux Weekly News <lwn@....net>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Andi Kleen <ak@...ux.intel.com>,
        David Ahern <dsahern@...il.com>,
        Don Zickus <dzickus@...hat.com>, Jiri Olsa <jolsa@...nel.org>,
        Joe Mario <jmario@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>, pi3orama@....com,
        Steven Rostedt <rostedt@...dmis.org>,
        Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
        Wang Nan <wangnan0@...wei.com>, Zefan Li <lizefan@...wei.com>
Subject: [GIT PULL 00/68] perf/core improvements and fixes

From: Arnaldo Carvalho de Melo <acme@...hat.com>

Hi Ingo,

	Please consider pulling,

- Arnaldo

Build and test stats at the end of the message.

The following changes since commit c68306ce20ad03ce655a367fc33ad06e12bb87a6:

  Merge tag 'perf-core-for-mingo-20161005' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2016-10-07 00:36:49 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161011

for you to fetch changes up to 193b29e31a5cfec42790a59fc453359bb6ee0ea1:

  perf jevents: Handle events including .c and .o (2016-10-11 12:34:39 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- The 'perf c2c' tool provides means for Shared Data C2C/HITM analysis.
  It allows you to track down cacheline contention. The tool is based
  on x86's load latency and precise store facility events provided by
  Intel CPUs.

  It was tested by Joe Mario and has proven to be useful, finding some
  cacheline contentions. Joe also wrote a blog post about the c2c tool,
  with examples:

    https://joemario.github.io/blog/2016/09/01/c2c-blog/

  There one finds extensive details on using the tool, with tips on
  reducing the volume of samples while still capturing enough to do
  its job. (Dick Fowles, Joe Mario, Don Zickus, Jiri Olsa)

- Add support in 'perf list' to show only events in vendor notation,
  built from JSON (Andi Kleen)

- Handle completion of upper case event names, which users of the JSON
  events are used to. Lowercase names also work. (Andi Kleen)

- Report Intel-PT/BTS instruction bytes in 'perf script' (Andi Kleen)
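
For readers new to the 'perf c2c' item above, here is a toy sketch of the
idea behind cacheline contention analysis: group sampled data addresses by
cacheline and flag lines that one CPU writes while another CPU touches them.
This is not how 'perf c2c' is implemented. The 64-byte line size, the sample
tuple format and the function names are all assumptions made for
illustration.

```python
from collections import defaultdict

CACHELINE_SIZE = 64  # assumption: 64-byte cachelines, common on x86


def cacheline_of(addr, size=CACHELINE_SIZE):
    """Return (cacheline base address, offset within the line)."""
    return addr & ~(size - 1), addr & (size - 1)


def contended_lines(samples, size=CACHELINE_SIZE):
    """Find cachelines written by one CPU and touched by another.

    samples: iterable of (cpu, data_address, is_store) tuples.
    This tuple layout is hypothetical, not a perf.data format.
    Returns {line_base: sorted list of offsets seen in that line}.
    """
    cpus = defaultdict(set)     # line base -> CPUs that touched the line
    writers = defaultdict(set)  # line base -> CPUs that stored to the line
    offsets = defaultdict(set)  # line base -> offsets accessed in the line
    for cpu, addr, is_store in samples:
        base, off = cacheline_of(addr, size)
        cpus[base].add(cpu)
        offsets[base].add(off)
        if is_store:
            writers[base].add(cpu)
    return {
        base: sorted(offsets[base])
        for base in cpus
        if writers[base] and len(cpus[base]) > 1
    }


if __name__ == "__main__":
    # Made-up samples: CPU 0 stores to 0x1000 while CPU 1 loads 0x1008,
    # which sits in the same 64-byte line (a false sharing candidate).
    demo = [
        (0, 0x1000, True),
        (1, 0x1008, False),
        (2, 0x2000, False),
        (3, 0x2010, False),
        (0, 0x3000, True),
    ]
    for base, offs in contended_lines(demo).items():
        print(hex(base), offs)
```

The line at 0x2000 is not reported because it is only read, and the one at
0x3000 because a single CPU touches it. The per-line offsets are loosely
analogous to the 'dcacheline' and 'offset' dimension keys listed in the
shortlog below.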

Fixes:

- Fix handling of numa nodes in perf.data files (Jiri Olsa)

- Fix scrolling when refreshing 'perf top --tui --hierarchy' entries (Namhyung Kim)

- Fix handling of events including .c and .o, which were being treated as
  BPF scripts instead of JSON ones (Wang Nan)

Infrastructure:

- Sync copy of x86's syscall table (Arnaldo Carvalho de Melo)

- Prep work for making libtraceevent more widely used (Jiri Olsa)

- Show list of features not present in a perf.data file when using
  'perf report --header-only', to help with debugging (Jiri Olsa)

- When failing to process a record, show its name, not its number (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>

----------------------------------------------------------------
Adrian Hunter (1):
      perf intel-pt/bts: Tidy instruction buffer size usage

Andi Kleen (3):
      perf list: Add support for listing only json events
      perf tools: Handle completion of upper case events
      perf intel-pt/bts: Report instruction bytes and length in sample

Arnaldo Carvalho de Melo (1):
      perf tools: Sync copy of x86's syscall table

Jiri Olsa (61):
      perf c2c: Introduce c2c_decode_stats function
      perf c2c: Introduce c2c_add_stats function
      perf c2c: Add c2c command
      perf c2c: Add record subcommand
      perf c2c: Add report subcommand
      perf c2c report: Add dimension support
      perf c2c report: Add sort_entry dimension support
      perf c2c report: Fallback to standard dimensions
      perf c2c report: Add sample processing
      perf c2c report: Add cacheline hists processing
      perf c2c report: Decode c2c_stats for hist entries
      perf c2c report: Add header macros
      perf c2c report: Add 'dcacheline' dimension key
      perf c2c report: Add 'offset' dimension key
      perf c2c report: Add 'iaddr' dimension key
      perf c2c report: Add hitm related dimension keys
      perf c2c report: Add stores related dimension keys
      perf c2c report: Add loads related dimension keys
      perf c2c report: Add llc and remote loads related dimension keys
      perf c2c report: Add llc load miss dimension key
      perf c2c report: Add total record sort key
      perf c2c report: Add total loads sort key
      perf c2c report: Add hitm percent sort key
      perf c2c report: Add hitm/store percent related sort keys
      perf c2c report: Add dram related sort keys
      perf c2c report: Add 'pid' sort key
      perf c2c report: Add 'tid' sort key
      perf c2c report: Add 'symbol' and 'dso' sort keys
      perf c2c report: Add 'node' sort key
      perf c2c report: Add stats related sort keys
      perf c2c report: Add 'cpucnt' sort key
      perf c2c report: Add src line sort key
      perf c2c report: Setup number of header lines for hists
      perf c2c report: Set final resort fields
      perf c2c report: Add stdio output support
      perf c2c report: Add main TUI browser
      perf c2c report: Add TUI cacheline browser
      perf c2c report: Add global stats stdio output
      perf c2c report: Add shared cachelines stats stdio output
      perf c2c report: Add c2c related stats stdio output
      perf c2c report: Allow to report callchains
      perf c2c report: Limit the cachelines table entries
      perf c2c report: Add support to choose local HITMs
      perf c2c report: Allow to set cacheline sort fields
      perf c2c report: Recalc width of global sort entries
      perf c2c report: Add cacheline index entry
      perf c2c report: Add support to manage symbol name length
      perf c2c report: Iterate node display in browser
      perf c2c report: Add help windows
      perf c2c: Add man page and credits
      tools lib traceevent: Add install_headers target
      tools lib traceevent: Add do_install_mkdir Makefile function
      tools lib traceevent: Rename LIB_FILE to LIB_TARGET
      tools lib traceevent: Add version for traceevent shared object
      tools lib: Add for_each_clear_bit macro
      perf report: Move captured info to generic header info
      perf header: Display missing features
      perf header: Display feature name on write failure
      perf header: Set nr_numa_nodes only when we parsed all the data
      perf c2c report: Add --no-source option
      perf c2c report: Add --show-all option

Namhyung Kim (1):
      perf top: Fix refreshing hierarchy entries on TUI

Wang Nan (1):
      perf jevents: Handle events including .c and .o

 tools/include/asm-generic/bitops.h                 |    1 +
 tools/include/asm-generic/bitops/__ffz.h           |   12 +
 tools/include/asm-generic/bitops/find.h            |   28 +
 tools/include/linux/bitops.h                       |    5 +
 tools/lib/find_bit.c                               |   25 +
 tools/lib/traceevent/Makefile                      |   40 +-
 tools/perf/Build                                   |    1 +
 tools/perf/Documentation/perf-c2c.txt              |  282 ++
 tools/perf/Documentation/perf-list.txt             |    2 +-
 tools/perf/MANIFEST                                |    1 +
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl  |    4 +-
 tools/perf/builtin-c2c.c                           | 2754 ++++++++++++++++++++
 tools/perf/builtin-list.c                          |    9 +-
 tools/perf/builtin.h                               |    1 +
 tools/perf/perf-completion.sh                      |    6 +-
 tools/perf/perf.c                                  |    1 +
 tools/perf/ui/browsers/hists.c                     |    5 +-
 tools/perf/ui/browsers/hists.h                     |    1 +
 tools/perf/util/event.h                            |    3 +
 tools/perf/util/header.c                           |   21 +-
 tools/perf/util/hist.c                             |    1 +
 tools/perf/util/hist.h                             |    1 +
 tools/perf/util/intel-bts.c                        |    9 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c  |    2 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h  |    1 +
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c  |   13 +-
 .../util/intel-pt-decoder/intel-pt-insn-decoder.h  |    6 +-
 tools/perf/util/intel-pt-decoder/intel-pt-log.c    |    4 +-
 tools/perf/util/intel-pt.c                         |   19 +-
 tools/perf/util/mem-events.c                       |  128 +
 tools/perf/util/mem-events.h                       |   37 +
 tools/perf/util/parse-events.c                     |    2 +-
 tools/perf/util/parse-events.l                     |    4 +-
 tools/perf/util/pmu.c                              |   14 +-
 tools/perf/util/pmu.h                              |    3 +-
 tools/perf/util/session.c                          |   10 -
 tools/perf/util/sort.c                             |    2 +-
 tools/perf/util/sort.h                             |    1 +
 38 files changed, 3393 insertions(+), 66 deletions(-)
 create mode 100644 tools/include/asm-generic/bitops/__ffz.h
 create mode 100644 tools/perf/Documentation/perf-c2c.txt
 create mode 100644 tools/perf/builtin-c2c.c

  [root@...et ~]# time dm
   1 66.368836810 alpine:3.4: Ok
   2 26.154146190 android-ndk:r12b-arm: Ok
   3 69.746739126 archlinux:latest: Ok
   4 39.624220291 centos:5: Ok
   5 58.689782208 centos:6: Ok
   6 69.851635081 centos:7: Ok
   7 63.079827869 debian:7: Ok
   8 68.955435266 debian:8: Ok
   9 38.571431258 debian:experimental: Ok
  10 69.558879497 fedora:20: Ok
  11 73.092759654 fedora:21: Ok
  12 72.443082285 fedora:22: Ok
  13 72.305159323 fedora:23: Ok
  14 77.316048256 fedora:24: Ok
  15 32.774333511 fedora:24-x-ARC-uClibc: Ok
  16 80.985293289 fedora:rawhide: Ok
  17 79.388121697 mageia:5: Ok
  18 72.485900821 opensuse:13.2: Ok
  19 73.519405793 opensuse:42.1: Ok
  20 81.367665352 opensuse:tumbleweed: Ok
  21 56.263699207 ubuntu:12.04.5: Ok
  22 38.300297066 ubuntu:14.04: Ok
  23 68.467777551 ubuntu:14.04.4: Ok
  24 70.120014470 ubuntu:15.10: Ok
  25 69.392704717 ubuntu:16.04: Ok
  26 68.643732518 ubuntu:16.04-x-arm: Ok
  27 58.529762081 ubuntu:16.04-x-arm64: Ok
  28 57.908570394 ubuntu:16.04-x-powerpc: Ok
  29 58.354897750 ubuntu:16.04-x-powerpc64: Ok
  30 60.598809333 ubuntu:16.04-x-powerpc64el: Ok
  31 58.995355673 ubuntu:16.04-x-s390: Ok
  32 74.705277358 ubuntu:16.10: Ok
  
  real	33m47.198s
  user	0m2.009s
  sys	0m2.429s
  [root@...et ~]#

  [acme@...et linux]$ perf stat make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
       make_util_pmu_bison_o_O: make util/pmu-bison.o
              make_no_libelf_O: make NO_LIBELF=1
                   make_pure_O: make
           make_no_libbionic_O: make NO_LIBBIONIC=1
             make_no_libperl_O: make NO_LIBPERL=1
            make_no_demangle_O: make NO_DEMANGLE=1
                make_no_gtk2_O: make NO_GTK2=1
              make_clean_all_O: make clean all
               make_no_slang_O: make NO_SLANG=1
             make_no_libnuma_O: make NO_LIBNUMA=1
                  make_debug_O: make DEBUG=1
             make_util_map_o_O: make util/map.o
           make_no_libpython_O: make NO_LIBPYTHON=1
                make_no_newt_O: make NO_NEWT=1
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
                 make_perf_o_O: make perf.o
         make_install_prefix_O: make install prefix=/tmp/krava
            make_install_bin_O: make install-bin
  make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
   make_install_prefix_slash_O: make install prefix=/tmp/krava/
                   make_help_O: make help
           make_no_backtrace_O: make NO_BACKTRACE=1
  - /home/acme/git/linux/tools/pD_TEST_FEATURE_DUMP_STATIC: cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP_STATIC  LDFLAGS='-static' feature-dump
  cd . && make FEATURE_DUMP_COPYcme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP_STATIC LDFLAGS='-static' feature-dump
                 make_static_O: make LDFLAGS=-static
                    make_doc_O: make doc
            make_no_auxtrace_O: make NO_AUXTRACE=1
                  make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
                   make_tags_O: make tags
        make_with_babeltrace_O: make LIBBABELTRACE=1
                make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
                make_install_O: make install
           make_no_libunwind_O: make NO_LIBUNWIND=1
              make_no_libbpf_O: make NO_LIBBPF=1
            make_no_libaudit_O: make NO_LIBAUDIT=1
  OK
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  [acme@...et linux]$

  [root@...et ~]# perf test
   1: vmlinux symtab matches kallsyms                          : Ok
   2: detect openat syscall event                              : Ok
   3: detect openat syscall event on all cpus                  : Ok
   4: read samples using the mmap interface                    : Ok
   5: parse events tests                                       : Ok
   6: Validate PERF_RECORD_* events & perf_sample fields       : Ok
   7: Test perf pmu format parsing                             : Ok
   8: Test dso data read                                       : Ok
   9: Test dso data cache                                      : Ok
  10: Test dso data reopen                                     : Ok
  11: roundtrip evsel->name check                              : Ok
  12: Check parsing of sched tracepoints fields                : Ok
  13: Generate and check syscalls:sys_enter_openat event fields: Ok
  14: struct perf_event_attr setup                             : Ok
  15: Test matching and linking multiple hists                 : Ok
  16: Try 'import perf' in python, checking link problems      : Ok
  17: Test breakpoint overflow signal handler                  : Ok
  18: Test breakpoint overflow sampling                        : Ok
  19: Test number of exit event of a simple workload           : Ok
  20: Test software clock events have valid period values      : Ok
  21: Test object code reading                                 : Ok
  22: Test sample parsing                                      : Ok
  23: Test using a dummy software event to keep tracking       : Ok
  24: Test parsing with no sample_id_all bit set               : Ok
  25: Test filtering hist entries                              : Ok
  26: Test mmap thread lookup                                  : Ok
  27: Test thread mg sharing                                   : Ok
  28: Test output sorting of hist entries                      : Ok
  29: Test cumulation of child hist entries                    : Ok
  30: Test tracking with sched_switch                          : Ok
  31: Filter fds with revents mask in a fdarray                : Ok
  32: Add fd to a fdarray, making it autogrow                  : Ok
  33: Test kmod_path__parse function                           : Ok
  34: Test thread map                                          : Ok
  35: Test LLVM searching and compiling                        :
  35.1: Basic BPF llvm compiling test                          : Ok
  35.2: Test kbuild searching                                  : Ok
  35.3: Compile source for BPF prologue generation test        : Ok
  35.4: Compile source for BPF relocation test                 : Ok
  36: Test topology in session                                 : Ok
  37: Test BPF filter                                          :
  37.1: Test basic BPF filtering                               : Ok
  37.2: Test BPF prologue generation                           : Ok
  37.3: Test BPF relocation checker                            : Ok
  38: Test thread map synthesize                               : Ok
  39: Test cpu map synthesize                                  : Ok
  40: Test stat config synthesize                              : Ok
  41: Test stat synthesize                                     : Ok
  42: Test stat round synthesize                               : Ok
  43: Test attr update synthesize                              : Ok
  44: Test events times                                        : Ok
  45: Test backward reading from ring buffer                   : Ok
  46: Test cpu map print                                       : Ok
  47: Test SDT event probing                                   : Ok
  48: Test is_printable_array function                         : Ok
  49: Test bitmap print                                        : Ok
  50: x86 rdpmc test                                           : Ok
  51: Test converting perf time to TSC                         : Ok
  52: Test dwarf unwind                                        : Ok
  53: Test x86 instruction decoder - new instructions          : Ok
  54: Test intel cqm nmi context read                          : Skip
  [root@...et ~]#
