Message-ID: <CABPqkBT0kEMOnTw1-E1kezCvtWNSws7dp2qzCWKVmjGPOEga8Q@mail.gmail.com>
Date:   Mon, 3 Sep 2018 19:45:48 -0700
From:   Stephane Eranian <eranian@...gle.com>
To:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:     Jiri Olsa <jolsa@...hat.com>, Jiri Olsa <jolsa@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Namhyung Kim <namhyung@...nel.org>
Subject: [RFC] perf tool improvement requests

Hi Arnaldo, Jiri,

A few weeks ago, you asked if I had more requests for the perf tool.
I have put together the following list to improve the usability of the
perf tool, at least for our usage. Nothing is very big, just small
improvements here and there.

1/ perf stat interval printing

    Today, the timestamp printed via perf stat -I is relative to the
start of the measurements. It would be beneficial to also support a
mode that uses a time source which can be synchronized with other
traces or profiles, for instance gettimeofday() or
clock_gettime(CLOCK_MONOTONIC).
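
    To illustrate the idea (this is not perf code), here is a minimal
Python sketch that rebases interval timestamps onto CLOCK_MONOTONIC,
assuming the clock is sampled right at the start of the measurement.
The helper name rebase_intervals and the sample values are made up for
the example.

```python
import time

def rebase_intervals(start_monotonic, relative_timestamps):
    """Convert timestamps relative to the start of the run into
    CLOCK_MONOTONIC values by adding the clock value sampled at start."""
    return [start_monotonic + t for t in relative_timestamps]

# Assumption: the gap between this call and the real start of the
# measurement is negligible.
start = time.clock_gettime(time.CLOCK_MONOTONIC)

# Hypothetical relative timestamps, such as those printed with -I 1000.
print(rebase_intervals(start, [1.0, 2.0, 3.0]))
```

    Timestamps rebased this way can be lined up with any other trace
that records the same clock.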

2/ perf report event grouping

  If you do:
  $ perf record -e '{ cycles, instructions, branches }' ....
  $ perf report
  It will show the 3 profiles together, which is VERY useful. However,
the output is confusing because it is hard to tell which % corresponds
to which event. I know it is cmdline order, but it would be good to
have a header on the columns naming the events, instead of
guessing. A few times, I had to resort to perf report --header-only to
figure out the event order. I discovered the 'i' key on the function
profile, but it is still hard to find the events, especially if you
passed many of them.
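
  As a sketch of what such a header amounts to, the Python snippet
below pairs each percentage column with its event name in cmdline
order. The helpers parse_group and label_row are hypothetical, the
percentages are made up, and the parser assumes plain event names
without embedded commas.

```python
def parse_group(spec):
    """Return the event names from a group spec like '{cycles,instructions}'."""
    inner = spec.strip().strip("{}")
    return [name.strip() for name in inner.split(",") if name.strip()]

def label_row(spec, percentages):
    """Attach an event name to each percentage column, in cmdline order."""
    events = parse_group(spec)
    if len(events) != len(percentages):
        raise ValueError("expected one percentage per event")
    return dict(zip(events, percentages))

# Made-up percentages for one row of the report.
print(label_row("{ cycles, instructions, branches }", [34.2, 14.6, 33.2]))
```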

3/ annotate output of loops

Percent│401f00:   xor    %eax,%eax
            │401f02:   test   %edi,%edi
            │401f04: ↓ jle    401f2b <triad+0x2b>
            │401f06:   nopw   %cs:0x0(%rax,%rax,1)
  34.20 │401f1┌─→  movsd  (%rcx,%rax,8),%xmm1
  14.60 │401f1│:   mulsd  %xmm0,%xmm1
  33.24 │401f1│:   addsd  (%rdx,%rax,8),%xmm1
    9.98 │401f1│:   movsd  %xmm1,(%rsi,%rax,8)
    0.10 │401f2│:   add    $0x1,%rax
    0.03 │401f2├──  cmp    %eax,%edi
    7.84 │401f2└──↑ jg     401f10 <triad+0x10>
            │401f2b:   mov    $0x18,%eax
            │401f30: ← retq

    The loop arrows cut through the code addresses: inside the loop,
401f10 is shown as 401f1. That is annoying!

4/ sorting and event groups

       If I do:
       $ perf record -e '{cycles,instructions}'
       $ perf report
       It will sort the samples based on the first event (the leader) of
the group. Yet here all events are sampling events, so you could just as
well sort by the second event. But I don't think perf report supports
sorting on multiple events. Both are from the same category: syms (or
ip).

        Right now, I would have to collect another profile:
       $ perf record -e '{instructions,cycles}'
       $ perf report
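
       The requested behavior comes down to choosing which column of
the group is the sort key. A minimal Python sketch, with made-up
symbols and percentages and a hypothetical helper sort_by_event:

```python
def sort_by_event(rows, events, key_event):
    """Sort (symbol, percentages) rows by the column of key_event, descending."""
    idx = events.index(key_event)
    return sorted(rows, key=lambda row: row[1][idx], reverse=True)

events = ["cycles", "instructions"]
# Made-up per-symbol percentages, one value per event in the group.
rows = [
    ("triad", (60.0, 20.0)),
    ("init", (10.0, 70.0)),
    ("main", (30.0, 10.0)),
]

# Same profile, sorted by the second event instead of the leader.
print([sym for sym, _ in sort_by_event(rows, events, "instructions")])
```

       Since both columns come from the same profile, no second
perf record run is needed to change the sort key.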

5/ cgroups

    Today, to measure multiple events in the same cgroup, you need to do:
     $ perf stat -e cycles,branches,instructions -G foo,foo,foo .....

     You need to specify the cgroup N times for N events. It would be
good to support a mode where you only have to specify each cgroup once:

      $ perf stat -e cycles,branches,instructions --cgroup-all foo,bar

      This would measure cycles,branches,instructions for both cgroups
foo and bar.
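
      Until such a mode exists, the expansion can be scripted. The
Python sketch below builds the -e/-G arguments that pair every event
with every cgroup, relying on -G matching cgroups to events
positionally. The helper expand_cgroups is hypothetical, and
--cgroup-all is only the option proposed above, not an existing flag.

```python
def expand_cgroups(events, cgroups):
    """Repeat the event list once per cgroup and pair each copy with it."""
    all_events = []
    all_cgroups = []
    for cgroup in cgroups:
        all_events.extend(events)
        all_cgroups.extend([cgroup] * len(events))
    return ["-e", ",".join(all_events), "-G", ",".join(all_cgroups)]

args = expand_cgroups(["cycles", "branches", "instructions"], ["foo", "bar"])
print("perf stat " + " ".join(args))
```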


6/ perf script ip vs. callchain

     I already submitted this request separately. It is about
providing a way to generate the callchain separately from the ip in
perf script. Right now, they are lumped together, which is not always
useful. Also, the callchain is currently a multi-line output, which is
not useful either. perf script should stick to one line per sample, at
least when symbolization is off. We have examples of that with
brstack.
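
     To make the one-line-per-sample idea concrete, here is a small
Python sketch that prints the ip and the callchain as separate fields
on a single line. The layout (ip=... callchain=...) and the helper
format_sample are assumptions for illustration, not an existing perf
script format, and the addresses are made up.

```python
def format_sample(ip, callchain):
    """Render one sample on one line, keeping ip and callchain separate."""
    chain = " ".join(f"{addr:#x}" for addr in callchain)
    return f"ip={ip:#x} callchain={chain}"

# Made-up unsymbolized sample.
print(format_sample(0x401f10, [0x401f10, 0x400a20, 0x400b44]))
```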

I may have more requests but I wanted to start with these for now.
Thanks for your efforts.
