Date: Tue, 11 Oct 2016 14:30:55 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: linux-kernel@...r.kernel.org, Linux Weekly News <lwn@....net>,
    Arnaldo Carvalho de Melo <acme@...hat.com>,
    Adrian Hunter <adrian.hunter@...el.com>,
    Andi Kleen <ak@...ux.intel.com>,
    David Ahern <dsahern@...il.com>,
    Don Zickus <dzickus@...hat.com>,
    Jiri Olsa <jolsa@...nel.org>,
    Joe Mario <jmario@...hat.com>,
    Namhyung Kim <namhyung@...nel.org>,
    Peter Zijlstra <a.p.zijlstra@...llo.nl>,
    pi3orama@....com,
    Steven Rostedt <rostedt@...dmis.org>,
    Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
    Wang Nan <wangnan0@...wei.com>,
    Zefan Li <lizefan@...wei.com>
Subject: [GIT PULL 00/68] perf/core improvements and fixes

From: Arnaldo Carvalho de Melo <acme@...hat.com>

Hi Ingo,

    Please consider pulling,

- Arnaldo

Build and test stats at the end of the message.

The following changes since commit c68306ce20ad03ce655a367fc33ad06e12bb87a6:

  Merge tag 'perf-core-for-mingo-20161005' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2016-10-07 00:36:49 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161011

for you to fetch changes up to 193b29e31a5cfec42790a59fc453359bb6ee0ea1:

  perf jevents: Handle events including .c and .o (2016-10-11 12:34:39 -0300)

----------------------------------------------------------------
perf/core improvements and fixes:

New features:

- The 'perf c2c' tool provides means for Shared Data C2C/HITM analysis.
  It allows you to track down cacheline contention. The tool is based
  on x86's load latency and precise store facility events provided by
  Intel CPUs.

  It was tested by Joe Mario and has proven to be useful, finding some
  cacheline contentions.
  Joe also wrote a blog about c2c tool with examples:

    https://joemario.github.io/blog/2016/09/01/c2c-blog/

  There one finds extensive details on using the tool, with tips on
  reducing the volume of samples while still capturing enough to do
  its job. (Dick Fowles, Joe Mario, Don Zickus, Jiri Olsa)

- Add support in 'perf list' to show only events in vendor notation,
  built from JSON (Andi Kleen)

- Handle completion of upper case events, as users of the JSON events
  are used to. Using it as lowercase also works. (Andi Kleen)

- Report Intel-PT/BTS instruction bytes in 'perf script' (Andi Kleen)

Fixes:

- Fix handling of numa nodes in perf.data files (Jiri Olsa)

- Fix scrolling when refreshing 'perf top --tui --hierarchy' entries
  (Namhyung Kim)

- Fix handling of events including .c and .o, that were being treated
  as BPF scripts instead of JSON ones (Wang Nan)

Infrastructure:

- Sync copy of x86's syscall table (Arnaldo Carvalho de Melo)

- prep work for making libtraceevent more widely used (Jiri Olsa)

- Show list of features not present in a perf.data file when using
  'perf report --header-only', to help with debugging (Jiri Olsa)

- When failing to process a record, show its name, not its number
  (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>

----------------------------------------------------------------
Adrian Hunter (1):
      perf intel-pt/bts: Tidy instruction buffer size usage

Andi Kleen (3):
      perf list: Add support for listing only json events
      perf tools: Handle completion of upper case events
      perf intel-pt/bts: Report instruction bytes and length in sample

Arnaldo Carvalho de Melo (1):
      perf tools: Sync copy of x86's syscall table

Jiri Olsa (61):
      perf c2c: Introduce c2c_decode_stats function
      perf c2c: Introduce c2c_add_stats function
      perf c2c: Add c2c command
      perf c2c: Add record subcommand
      perf c2c: Add report subcommand
      perf c2c report: Add dimension support
      perf c2c report: Add sort_entry dimension support
      perf c2c report: Fallback to standard dimensions
      perf c2c report: Add sample processing
      perf c2c report: Add cacheline hists processing
      perf c2c report: Decode c2c_stats for hist entries
      perf c2c report: Add header macros
      perf c2c report: Add 'dcacheline' dimension key
      perf c2c report: Add 'offset' dimension key
      perf c2c report: Add 'iaddr' dimension key
      perf c2c report: Add hitm related dimension keys
      perf c2c report: Add stores related dimension keys
      perf c2c report: Add loads related dimension keys
      perf c2c report: Add llc and remote loads related dimension keys
      perf c2c report: Add llc load miss dimension key
      perf c2c report: Add total record sort key
      perf c2c report: Add total loads sort key
      perf c2c report: Add hitm percent sort key
      perf c2c report: Add hitm/store percent related sort keys
      perf c2c report: Add dram related sort keys
      perf c2c report: Add 'pid' sort key
      perf c2c report: Add 'tid' sort key
      perf c2c report: Add 'symbol' and 'dso' sort keys
      perf c2c report: Add 'node' sort key
      perf c2c report: Add stats related sort keys
      perf c2c report: Add 'cpucnt' sort key
      perf c2c report: Add src line sort key
      perf c2c report: Setup number of header lines for hists
      perf c2c report: Set final resort fields
      perf c2c report: Add stdio output support
      perf c2c report: Add main TUI browser
      perf c2c report: Add TUI cacheline browser
      perf c2c report: Add global stats stdio output
      perf c2c report: Add shared cachelines stats stdio output
      perf c2c report: Add c2c related stats stdio output
      perf c2c report: Allow to report callchains
      perf c2c report: Limit the cachelines table entries
      perf c2c report: Add support to choose local HITMs
      perf c2c report: Allow to set cacheline sort fields
      perf c2c report: Recalc width of global sort entries
      perf c2c report: Add cacheline index entry
      perf c2c report: Add support to manage symbol name length
      perf c2c report: Iterate node display in browser
      perf c2c report: Add help windows
      perf c2c: Add man page and credits
      tools lib traceevent: Add install_headers target
      tools lib traceevent: Add do_install_mkdir Makefile function
      tools lib traceevent: Rename LIB_FILE to LIB_TARGET
      tools lib traceevent: Add version for traceevent shared object
      tools lib: Add for_each_clear_bit macro
      perf report: Move captured info to generic header info
      perf header: Display missing features
      perf header: Display feature name on write failure
      perf header: Set nr_numa_nodes only when we parsed all the data
      perf c2c report: Add --no-source option
      perf c2c report: Add --show-all option

Namhyung Kim (1):
      perf top: Fix refreshing hierarchy entries on TUI

Wang Nan (1):
      perf jevents: Handle events including .c and .o

 tools/include/asm-generic/bitops.h                |    1 +
 tools/include/asm-generic/bitops/__ffz.h          |   12 +
 tools/include/asm-generic/bitops/find.h           |   28 +
 tools/include/linux/bitops.h                      |    5 +
 tools/lib/find_bit.c                              |   25 +
 tools/lib/traceevent/Makefile                     |   40 +-
 tools/perf/Build                                  |    1 +
 tools/perf/Documentation/perf-c2c.txt             |  282 ++
 tools/perf/Documentation/perf-list.txt            |    2 +-
 tools/perf/MANIFEST                               |    1 +
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl |    4 +-
 tools/perf/builtin-c2c.c                          | 2754 ++++++++++++++++++++
 tools/perf/builtin-list.c                         |    9 +-
 tools/perf/builtin.h                              |    1 +
 tools/perf/perf-completion.sh                     |    6 +-
 tools/perf/perf.c                                 |    1 +
 tools/perf/ui/browsers/hists.c                    |    5 +-
 tools/perf/ui/browsers/hists.h                    |    1 +
 tools/perf/util/event.h                           |    3 +
 tools/perf/util/header.c                          |   21 +-
 tools/perf/util/hist.c                            |    1 +
 tools/perf/util/hist.h                            |    1 +
 tools/perf/util/intel-bts.c                       |    9 +-
 .../perf/util/intel-pt-decoder/intel-pt-decoder.c |    2 +
 .../perf/util/intel-pt-decoder/intel-pt-decoder.h |    1 +
 .../util/intel-pt-decoder/intel-pt-insn-decoder.c |   13 +-
 .../util/intel-pt-decoder/intel-pt-insn-decoder.h |    6 +-
 tools/perf/util/intel-pt-decoder/intel-pt-log.c   |    4 +-
 tools/perf/util/intel-pt.c                        |   19 +-
 tools/perf/util/mem-events.c                      |  128 +
 tools/perf/util/mem-events.h                      |   37 +
 tools/perf/util/parse-events.c                    |    2 +-
 tools/perf/util/parse-events.l                    |    4 +-
 tools/perf/util/pmu.c                             |   14 +-
 tools/perf/util/pmu.h                             |    3 +-
 tools/perf/util/session.c                         |   10 -
 tools/perf/util/sort.c                            |    2 +-
 tools/perf/util/sort.h                            |    1 +
 38 files changed, 3393 insertions(+), 66 deletions(-)
 create mode 100644 tools/include/asm-generic/bitops/__ffz.h
 create mode 100644 tools/perf/Documentation/perf-c2c.txt
 create mode 100644 tools/perf/builtin-c2c.c

[root@...et ~]# time dm
 1 66.368836810 alpine:3.4: Ok
 2 26.154146190 android-ndk:r12b-arm: Ok
 3 69.746739126 archlinux:latest: Ok
 4 39.624220291 centos:5: Ok
 5 58.689782208 centos:6: Ok
 6 69.851635081 centos:7: Ok
 7 63.079827869 debian:7: Ok
 8 68.955435266 debian:8: Ok
 9 38.571431258 debian:experimental: Ok
10 69.558879497 fedora:20: Ok
11 73.092759654 fedora:21: Ok
12 72.443082285 fedora:22: Ok
13 72.305159323 fedora:23: Ok
14 77.316048256 fedora:24: Ok
15 32.774333511 fedora:24-x-ARC-uClibc: Ok
16 80.985293289 fedora:rawhide: Ok
17 79.388121697 mageia:5: Ok
18 72.485900821 opensuse:13.2: Ok
19 73.519405793 opensuse:42.1: Ok
20 81.367665352 opensuse:tumbleweed: Ok
21 56.263699207 ubuntu:12.04.5: Ok
22 38.300297066 ubuntu:14.04: Ok
23 68.467777551 ubuntu:14.04.4: Ok
24 70.120014470 ubuntu:15.10: Ok
25 69.392704717 ubuntu:16.04: Ok
26 68.643732518 ubuntu:16.04-x-arm: Ok
27 58.529762081 ubuntu:16.04-x-arm64: Ok
28 57.908570394 ubuntu:16.04-x-powerpc: Ok
29 58.354897750 ubuntu:16.04-x-powerpc64: Ok
30 60.598809333 ubuntu:16.04-x-powerpc64el: Ok
31 58.995355673 ubuntu:16.04-x-s390: Ok
32 74.705277358 ubuntu:16.10: Ok

real    33m47.198s
user    0m2.009s
sys     0m2.429s
[root@...et ~]#

[acme@...et linux]$ perf stat make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_libelf_O: make NO_LIBELF=1
make_pure_O: make
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_gtk2_O: make NO_GTK2=1
make_clean_all_O: make clean all
make_no_slang_O: make NO_SLANG=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_debug_O: make DEBUG=1
make_util_map_o_O: make util/map.o
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_newt_O: make NO_NEWT=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
make_install_prefix_O: make install prefix=/tmp/krava
make_install_bin_O: make install-bin
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_help_O: make help
make_no_backtrace_O: make NO_BACKTRACE=1
- /home/acme/git/linux/tools/pD_TEST_FEATURE_DUMP_STATIC: cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP_STATIC LDFLAGS='-static' feature-dump
cd . && make FEATURE_DUMP_COPYcme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP_STATIC LDFLAGS='-static' feature-dump
make_static_O: make LDFLAGS=-static
make_doc_O: make doc
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_tags_O: make tags
make_with_babeltrace_O: make LIBBABELTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
make_install_O: make install
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libaudit_O: make NO_LIBAUDIT=1
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
[acme@...et linux]$

[root@...et ~]# perf test
 1: vmlinux symtab matches kallsyms : Ok
 2: detect openat syscall event : Ok
 3: detect openat syscall event on all cpus : Ok
 4: read samples using the mmap interface : Ok
 5: parse events tests : Ok
 6: Validate PERF_RECORD_* events & perf_sample fields : Ok
 7: Test perf pmu format parsing : Ok
 8: Test dso data read : Ok
 9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
[root@...et ~]#