netdev - Re: [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAH3MdRWqwOGrE03wo3pqd-shNvETMEf7jObFMndgDuH6Mz=y5A@mail.gmail.com>
Date:   Fri, 18 May 2018 15:13:26 -0700
From:   Y Song <ys114321@...il.com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
Cc:     Yonghong Song <yhs@...com>,
        Quentin Monnet <quentin.monnet@...ronome.com>,
        peterz@...radead.org, Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        netdev <netdev@...r.kernel.org>, kernel-team@...com
Subject: Re: [PATCH bpf-next v2 7/7] tools/bpftool: add perf subcommand

On Fri, May 18, 2018 at 1:51 PM, Jakub Kicinski
<jakub.kicinski@...ronome.com> wrote:
> On Thu, 17 May 2018 22:03:10 -0700, Yonghong Song wrote:
>> The new command "bpftool perf [show | list]" will traverse
>> all processes under /proc, and if any fd is associated
>> with a perf event, it will print out related perf event
>> information. Documentation is also added.
>
> Thanks for the changes, it looks good with some minor nits which can be
> addressed as follow up if there is no other need to respin.  Please
> consider it:
>
> Reviewed-by: Jakub Kicinski <jakub.kicinski@...ronome.com>

Most likely will need respin. Will make suggested changes then.

>
>> Below is an example to show the results using bcc commands.
>> Running the following 4 bcc commands:
>>   kprobe:     trace.py '__x64_sys_nanosleep'
>>   kretprobe:  trace.py 'r::__x64_sys_nanosleep'
>>   tracepoint: trace.py 't:syscalls:sys_enter_nanosleep'
>>   uprobe:     trace.py 'p:/home/yhs/a.out:main'
>>
>> The bpftool command line and result:
>>
>>   $ bpftool perf
>>   pid 21711  fd 5: prog_id 5  kprobe  func __x64_sys_write  offset 0
>>   pid 21765  fd 5: prog_id 7  kretprobe  func __x64_sys_nanosleep  offset 0
>>   pid 21767  fd 5: prog_id 8  tracepoint  sys_enter_nanosleep
>>   pid 21800  fd 5: prog_id 9  uprobe  filename /home/yhs/a.out  offset 1159
>>
>>   $ bpftool -j perf
>>   {"pid":21711,"fd":5,"prog_id":5,"attach_info":"kprobe","func":"__x64_sys_write","offset":0}, \
>>   {"pid":21765,"fd":5,"prog_id":7,"attach_info":"kretprobe","func":"__x64_sys_nanosleep","offset":0}, \
>>   {"pid":21767,"fd":5,"prog_id":8,"attach_info":"tracepoint","tracepoint":"sys_enter_nanosleep"}, \
>>   {"pid":21800,"fd":5,"prog_id":9,"attach_info":"uprobe","filename":"/home/yhs/a.out","offset":1159}
>
> nit: this is now an array

Sorry, this is probably updated in middle of work. Will make the change in
the next revision.

>
>>   $ bpftool prog
>>   5: kprobe  name probe___x64_sys  tag e495a0c82f2c7a8d  gpl
>>         loaded_at 2018-05-15T04:46:37-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 4
>>   7: kprobe  name probe___x64_sys  tag f2fdee479a503abf  gpl
>>         loaded_at 2018-05-15T04:48:32-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 7
>>   8: tracepoint  name tracepoint__sys  tag 5390badef2395fcf  gpl
>>         loaded_at 2018-05-15T04:48:48-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 8
>>   9: kprobe  name probe_main_1  tag 0a87bdc2e2953b6d  gpl
>>         loaded_at 2018-05-15T04:49:52-0700  uid 0
>>         xlated 200B  not jited  memlock 4096B  map_ids 9
>>
>>   $ ps ax | grep "python ./trace.py"
>>   21711 pts/0    T      0:03 python ./trace.py __x64_sys_write
>>   21765 pts/0    S+     0:00 python ./trace.py r::__x64_sys_nanosleep
>>   21767 pts/2    S+     0:00 python ./trace.py t:syscalls:sys_enter_nanosleep
>>   21800 pts/3    S+     0:00 python ./trace.py p:/home/yhs/a.out:main
>>   22374 pts/1    S+     0:00 grep --color=auto python ./trace.py
>>
>> Signed-off-by: Yonghong Song <yhs@...com>
>
>> diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
>> index b301c9b..3680ad4 100644
>> --- a/tools/bpf/bpftool/bash-completion/bpftool
>> +++ b/tools/bpf/bpftool/bash-completion/bpftool
>> @@ -448,6 +448,15 @@ _bpftool()
>>                      ;;
>>              esac
>>              ;;
>> +        cgroup)
>
> s/cgroup/perf/ :)

A mistake in my side to consolidate different version of code.
I did have "perf" in one of my versions and tested it properly.

>
>> +            case $command in
>> +                *)
>> +                    [[ $prev == $object ]] && \
>> +                        COMPREPLY=( $( compgen -W 'help \
>> +                            show list' -- "$cur" ) )
>> +                    ;;
>> +            esac
>> +            ;;
>>      esac
>>  } &&
>>  complete -F _bpftool bpftool
>
>> +static int show_proc(const char *fpath, const struct stat *sb,
>> +                  int tflag, struct FTW *ftwbuf)
>> +{
>> +     __u64 probe_offset, probe_addr;
>> +     __u32 prog_id, attach_info;
>> +     int err, pid = 0, fd = 0;
>> +     const char *pch;
>> +     char buf[4096];
>> +
>> +     /* prefix always /proc */
>> +     pch = fpath + 5;
>> +     if (*pch == '\0')
>> +             return 0;
>> +
>> +     /* pid should be all numbers */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             pid = pid * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd directory */
>> +     pch++;
>> +     if (strncmp(pch, "fd", 2))
>> +             return FTW_SKIP_SUBTREE;
>> +     pch += 2;
>> +     if (*pch == '\0')
>> +             return 0;
>> +     if (*pch != '/')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* check /proc/<pid>/fd/<fd_num> */
>> +     pch++;
>> +     while (isdigit(*pch)) {
>> +             fd = fd * 10 + *pch - '0';
>> +             pch++;
>> +     }
>> +     if (*pch != '\0')
>> +             return FTW_SKIP_SUBTREE;
>> +
>> +     /* query (pid, fd) for potential perf events */
>> +     err = bpf_task_fd_query(pid, fd, 0, buf, sizeof(buf), &prog_id,
>> +                             &attach_info, &probe_offset, &probe_addr);
>> +     if (err < 0)
>> +             return 0;
>
> nit: it could be nice from user perspective to detect whether kernel
>      supports the command and fail if not.  Otherwise user is not sure
>      if there is no output because kernel lacks support or because
>      there were really no attached progs.  Just a thought, not really
>      a requirement.

I agree with you. it is good to output an error if the kernel does not
support the syscall (e.g., either non-root or new subcommand is not
supported).

>
>> +     if (json_output)
>> +             print_perf_json(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                             probe_addr);
>> +     else
>> +             print_perf_plain(pid, fd, prog_id, attach_info, buf, probe_offset,
>> +                              probe_addr);
>> +
>> +     return 0;
>> +}
>> +
>> +static int do_show(int argc, char **argv)
>> +{
>> +     int err = 0, nopenfd = 16;
>> +     int flags = FTW_ACTIONRETVAL | FTW_PHYS;
>
> nit: reverse xmas tree

Will make the change in the next revision.

>
>> +     if (json_output)
>> +             jsonw_start_array(json_wtr);
>> +     if (nftw("/proc", show_proc, nopenfd, flags) == -1) {
>> +             p_err("%s", strerror(errno));
>> +             err = -1;
>> +     }
>> +     if (json_output)
>> +             jsonw_end_array(json_wtr);
>> +
>> +     return err;
>> +}