lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 19 Mar 2019 11:52:28 -0300
From:   Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>
To:     Song Liu <songliubraving@...com>
Cc:     "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
        Networking <netdev@...r.kernel.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>,
        Peter Zijlstra <peterz@...radead.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Stanislav Fomichev <sdf@...ichev.me>
Subject: [WORKS!] Re: [PATCH v9 perf,bpf 12/15] perf, bpf: enable annotation
 of bpf program

Em Tue, Mar 19, 2019 at 11:14:35AM -0300, Arnaldo Carvalho de Melo escreveu:
> 
> Now the test is really being performed, the test-all.bin feature
> detection fast path is working, proceeding.

Now it all works, really nice!

I added a 'Committer testing' section with all the steps to test this
and pushed to:

https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core&id=6530d44e7dc1d41c73fe70b9e6ea1b9cabebb57f

Which I repeat here, please go thru it and check if everything is ok, if
the disassembled + original BPF source code looks ok, etc:

    Committer testing:
    
    1) Have a BPF program running, one that has BTF info, etc, I used
       the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
       by 'perf trace'.
    
      # grep -B1 augmented_raw ~/.perfconfig
      [trace]
            add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
      #
      # perf trace -e *mmsg
      dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
      NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
    
    2) Then do a 'perf record' system wide for a while:
    
      # perf record -a
      ^C[ perf record: Woken up 68 times to write data ]
      [ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
      #
    
    3) Check that we captured BPF and BTF info in the perf.data file:
    
      # perf report --header-only | grep 'b[pt]f'
      # event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
      # bpf_prog_info of id 13
      # bpf_prog_info of id 14
      # bpf_prog_info of id 15
      # bpf_prog_info of id 16
      # bpf_prog_info of id 17
      # bpf_prog_info of id 18
      # bpf_prog_info of id 21
      # bpf_prog_info of id 22
      # bpf_prog_info of id 41
      # bpf_prog_info of id 42
      # btf info of id 2
      #
    
    4) Check which programs got recorded:
    
       # perf report | grep bpf_prog | head
         0.16%  exe              bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
         0.14%  exe              bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
         0.08%  fuse-overlayfs   bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
         0.07%  fuse-overlayfs   bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
         0.01%  clang-4.0        bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
         0.01%  clang-4.0        bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
         0.00%  clang            bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
         0.00%  runc             bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
         0.00%  clang            bpf_prog_819967866022f1e1_sys_enter      [k] bpf_prog_819967866022f1e1_sys_enter
         0.00%  sh               bpf_prog_c1bd85c092d6e4aa_sys_exit       [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
      #
    
      This was with the default --sort order for 'perf report', which is:
    
        --sort comm,dso,symbol
    
      If we just look for the symbol, for instance:
    
       # perf report --sort symbol | grep bpf_prog | head
         0.26%  [k] bpf_prog_819967866022f1e1_sys_enter                -      -
         0.24%  [k] bpf_prog_c1bd85c092d6e4aa_sys_exit                 -      -
       #
    
      or the DSO:
    
       # perf report --sort dso | grep bpf_prog | head
         0.26%  bpf_prog_819967866022f1e1_sys_enter
         0.24%  bpf_prog_c1bd85c092d6e4aa_sys_exit
      #
    
    We'll see the two BPF programs that augmented_raw_syscalls.o puts in
    place,  one attached to the raw_syscalls:sys_enter and another to the
    raw_syscalls:sys_exit tracepoints, as expected.
    
    Now we can finally do, from the command line, annotation for one of
    those two symbols, with the original BPF program source coude intermixed
    with the disassembled JITed code:
    
      # perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
    
      Samples: 950  of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
      bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
      Percent      int sys_enter(struct syscall_enter_args *args)
       53.41         push   %rbp
    
        0.63         mov    %rsp,%rbp
        0.31         sub    $0x170,%rsp
        1.93         sub    $0x28,%rbp
        7.02         mov    %rbx,0x0(%rbp)
        3.20         mov    %r13,0x8(%rbp)
        1.07         mov    %r14,0x10(%rbp)
        0.61         mov    %r15,0x18(%rbp)
        0.11         xor    %eax,%eax
        1.29         mov    %rax,0x20(%rbp)
        0.11         mov    %rdi,%rbx
                    return bpf_get_current_pid_tgid();
        2.02       → callq  *ffffffffda6776d9
        2.76         mov    %eax,-0x148(%rbp)
                     mov    %rbp,%rsi
                   int sys_enter(struct syscall_enter_args *args)
                     add    $0xfffffffffffffeb8,%rsi
                    return bpf_map_lookup_elem(pids, &pid) != NULL;
                     movabs $0xffff975ac2607800,%rdi
    
        1.26       → callq  *ffffffffda6789e9
                     cmp    $0x0,%rax
        2.43       → je     0
                     add    $0x38,%rax
        0.21         xor    %r13d,%r13d
                    if (pid_filter__has(&pids_filtered, getpid()))
        0.81         cmp    $0x0,%rax
                   → jne    0
                     mov    %rbp,%rdi
                    probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
        2.22         add    $0xfffffffffffffeb8,%rdi
        0.11         mov    $0x40,%esi
        0.32         mov    %rbx,%rdx
        2.74       → callq  *ffffffffda658409
                    syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
        0.22         mov    %rbp,%rsi
        1.69         add    $0xfffffffffffffec0,%rsi
                    syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
                     movabs $0xffff975bfcd36000,%rdi
    
                     add    $0xd0,%rdi
        0.21         mov    0x0(%rsi),%eax
        0.93         cmp    $0x200,%rax
                   → jae    0
        0.10         shl    $0x3,%rax
    
        0.11         add    %rdi,%rax
        0.11       → jmp    0
                     xor    %eax,%eax
                    if (syscall == NULL || !syscall->enabled)
        1.07         cmp    $0x0,%rax
                   → je     0
                    if (syscall == NULL || !syscall->enabled)
        6.57         movzbq 0x0(%rax),%rdi
    
                    if (syscall == NULL || !syscall->enabled)
                     cmp    $0x0,%rdi
        0.95       → je     0
                     mov    $0x40,%r8d
                    switch (augmented_args.args.syscall_nr) {
                     mov    -0x140(%rbp),%rdi
                    switch (augmented_args.args.syscall_nr) {
                     cmp    $0x2,%rdi
                   → je     0
                     cmp    $0x101,%rdi
                   → je     0
                     cmp    $0x15,%rdi
                   → jne    0
                    case SYS_OPEN:   filename_arg = (const void *)args->args[0];
                     mov    0x10(%rbx),%rdx
                   → jmp    0
                    case SYS_OPENAT: filename_arg = (const void *)args->args[1];
                     mov    0x18(%rbx),%rdx
                    if (filename_arg != NULL) {
                     cmp    $0x0,%rdx
                   → je     0
                     xor    %edi,%edi
                            augmented_args.filename.reserved = 0;
                     mov    %edi,-0x104(%rbp)
                            augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                     mov    %rbp,%rdi
                     add    $0xffffffffffffff00,%rdi
                            augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                     mov    $0x100,%esi
                   → callq  *ffffffffda658499
                     mov    $0x148,%r8d
                            augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                     mov    %eax,-0x108(%rbp)
                            augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
                     mov    %rax,%rdi
                     shl    $0x20,%rdi
    
                     shr    $0x20,%rdi
    
                            if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
                     cmp    $0xff,%rdi
                   → ja     0
                                    len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
                     add    $0x48,%rax
                                    len &= sizeof(augmented_args.filename.value) - 1;
                     and    $0xff,%rax
                     mov    %rax,%r8
                     mov    %rbp,%rcx
                    return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
                     add    $0xfffffffffffffeb8,%rcx
                     mov    %rbx,%rdi
                     movabs $0xffff975fbd72d800,%rsi
    
                     mov    $0xffffffff,%edx
                   → callq  *ffffffffda658ad9
                     mov    %rax,%r13
                   }
                     mov    %r13,%rax
        0.72         mov    0x0(%rbp),%rbx
                     mov    0x8(%rbp),%r13
        1.16         mov    0x10(%rbp),%r14
        0.10         mov    0x18(%rbp),%r15
        0.42         add    $0x28,%rbp
        0.54         leaveq
        0.54       ← retq
      #
    
    Please see 'man perf-config' to see how to control what should be seen,
    via ~/.perfconfig [annotate] section, for instance, one can suppress the
    source code and see just the disassembly, etc.
    
    Alternatively, use the TUI bu just using 'perf annotate', press
    '/bpf_prog' to see the bpf symbols, press enter and do the interactive
    annotation, which allows for dumping to a file after selecting the
    the various output tunables, for instance, the above without source code
    intermixed, plus showing all the instruction offsets:
    
      # perf annotate bpf_prog_819967866022f1e1_sys_enter
    
    Then press: 's' to hide the source code + 'O' twice to show all
    instruction offsets, then 'P' to print to the
    bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
    
      # cat bpf_prog_819967866022f1e1_sys_enter.annotation
      bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
      Event: cycles:ppp
    
       53.41    0:   push   %rbp
    
        0.63    1:   mov    %rsp,%rbp
        0.31    4:   sub    $0x170,%rsp
        1.93    b:   sub    $0x28,%rbp
        7.02    f:   mov    %rbx,0x0(%rbp)
        3.20   13:   mov    %r13,0x8(%rbp)
        1.07   17:   mov    %r14,0x10(%rbp)
        0.61   1b:   mov    %r15,0x18(%rbp)
        0.11   1f:   xor    %eax,%eax
        1.29   21:   mov    %rax,0x20(%rbp)
        0.11   25:   mov    %rdi,%rbx
        2.02   28: → callq  *ffffffffda6776d9
        2.76   2d:   mov    %eax,-0x148(%rbp)
               33:   mov    %rbp,%rsi
               36:   add    $0xfffffffffffffeb8,%rsi
               3d:   movabs $0xffff975ac2607800,%rdi
    
        1.26   47: → callq  *ffffffffda6789e9
               4c:   cmp    $0x0,%rax
        2.43   50: → je     0
               52:   add    $0x38,%rax
        0.21   56:   xor    %r13d,%r13d
        0.81   59:   cmp    $0x0,%rax
               5d: → jne    0
               63:   mov    %rbp,%rdi
        2.22   66:   add    $0xfffffffffffffeb8,%rdi
        0.11   6d:   mov    $0x40,%esi
        0.32   72:   mov    %rbx,%rdx
        2.74   75: → callq  *ffffffffda658409
        0.22   7a:   mov    %rbp,%rsi
        1.69   7d:   add    $0xfffffffffffffec0,%rsi
               84:   movabs $0xffff975bfcd36000,%rdi
    
               8e:   add    $0xd0,%rdi
        0.21   95:   mov    0x0(%rsi),%eax
        0.93   98:   cmp    $0x200,%rax
               9f: → jae    0
        0.10   a1:   shl    $0x3,%rax
    
        0.11   a5:   add    %rdi,%rax
        0.11   a8: → jmp    0
               aa:   xor    %eax,%eax
        1.07   ac:   cmp    $0x0,%rax
               b0: → je     0
        6.57   b6:   movzbq 0x0(%rax),%rdi
    
               bb:   cmp    $0x0,%rdi
        0.95   bf: → je     0
               c5:   mov    $0x40,%r8d
               cb:   mov    -0x140(%rbp),%rdi
               d2:   cmp    $0x2,%rdi
               d6: → je     0
               d8:   cmp    $0x101,%rdi
               df: → je     0
               e1:   cmp    $0x15,%rdi
               e5: → jne    0
               e7:   mov    0x10(%rbx),%rdx
               eb: → jmp    0
               ed:   mov    0x18(%rbx),%rdx
               f1:   cmp    $0x0,%rdx
               f5: → je     0
               f7:   xor    %edi,%edi
               f9:   mov    %edi,-0x104(%rbp)
               ff:   mov    %rbp,%rdi
              102:   add    $0xffffffffffffff00,%rdi
              109:   mov    $0x100,%esi
              10e: → callq  *ffffffffda658499
              113:   mov    $0x148,%r8d
              119:   mov    %eax,-0x108(%rbp)
              11f:   mov    %rax,%rdi
              122:   shl    $0x20,%rdi
    
              126:   shr    $0x20,%rdi
    
              12a:   cmp    $0xff,%rdi
              131: → ja     0
              133:   add    $0x48,%rax
              137:   and    $0xff,%rax
              13d:   mov    %rax,%r8
              140:   mov    %rbp,%rcx
              143:   add    $0xfffffffffffffeb8,%rcx
              14a:   mov    %rbx,%rdi
              14d:   movabs $0xffff975fbd72d800,%rsi
    
              157:   mov    $0xffffffff,%edx
              15c: → callq  *ffffffffda658ad9
              161:   mov    %rax,%r13
              164:   mov    %r13,%rax
        0.72  167:   mov    0x0(%rbp),%rbx
              16b:   mov    0x8(%rbp),%r13
        1.16  16f:   mov    0x10(%rbp),%r14
        0.10  173:   mov    0x18(%rbp),%r15
        0.42  177:   add    $0x28,%rbp
        0.54  17b:   leaveq
        0.54  17c: ← retq
    
    Another cool way to test all this is to symple use 'perf top' look for
    those symbols, go there and press enter, annotate it live :-)
    
    Signed-off-by: Song Liu <songliubraving@...com>
    Reviewed-by: Jiri Olsa <jolsa@...nel.org>
    Tested-by: Arnaldo Carvalho de Melo <acme@...hat.com>
    Cc: Alexei Starovoitov <ast@...nel.org>
    Cc: Daniel Borkmann <daniel@...earbox.net>
    Cc: Namhyung Kim <namhyung@...nel.org>
    Cc: Peter Zijlstra <peterz@...radead.org>
    Cc: Stanislav Fomichev <sdf@...gle.com>
    Link: http://lkml.kernel.org/r/20190312053051.2690567-13-songliubraving@fb.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ