netdev - Re: [PATCH bpf-next v1 15/19] tools/libbpf: add bpf

Message-ID: <d27a42f2-47a6-e12b-56c0-13c447ce15d1@fb.com>
Date:   Sat, 2 May 2020 00:17:24 -0700
From:   Yonghong Song <yhs@...com>
To:     Andrii Nakryiko <andrii.nakryiko@...il.com>
CC:     Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
        Martin KaFai Lau <kafai@...com>,
        Networking <netdev@...r.kernel.org>,
        Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next v1 15/19] tools/libbpf: add bpf_iter support



On 4/29/20 6:41 PM, Andrii Nakryiko wrote:
> On Mon, Apr 27, 2020 at 1:17 PM Yonghong Song <yhs@...com> wrote:
>>
>> Three new libbpf APIs are added to support bpf_iter:
>>    - bpf_program__attach_iter
>>      Given a bpf program and additional parameters, which is
>>      none now, returns a bpf_link.
>>    - bpf_link__create_iter
>>      Given a bpf_link, create a bpf_iter and return a fd
>>      so user can then do read() to get seq_file output data.
>>    - bpf_iter_create
>>      syscall level API to create a bpf iterator.
>>
>> Two macros, BPF_SEQ_PRINTF0 and BPF_SEQ_PRINTF, are also introduced.
>> These two macros can help bpf program writers with
>> nicer bpf_seq_printf syntax similar to the kernel one.
>>
>> Signed-off-by: Yonghong Song <yhs@...com>
>> ---
>>   tools/lib/bpf/bpf.c         | 11 +++++++
>>   tools/lib/bpf/bpf.h         |  2 ++
>>   tools/lib/bpf/bpf_tracing.h | 23 ++++++++++++++
>>   tools/lib/bpf/libbpf.c      | 60 +++++++++++++++++++++++++++++++++++++
>>   tools/lib/bpf/libbpf.h      | 11 +++++++
>>   tools/lib/bpf/libbpf.map    |  7 +++++
>>   6 files changed, 114 insertions(+)
>>
>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>> index 5cc1b0785d18..7ffd6c0ad95f 100644
>> --- a/tools/lib/bpf/bpf.c
>> +++ b/tools/lib/bpf/bpf.c
>> @@ -619,6 +619,17 @@ int bpf_link_update(int link_fd, int new_prog_fd,
>>          return sys_bpf(BPF_LINK_UPDATE, &attr, sizeof(attr));
>>   }
>>
>> +int bpf_iter_create(int link_fd, unsigned int flags)
> 
> Do you envision anything more than just flags being passed for
> bpf_iter_create? I wonder if we should just go ahead with options
> struct here?

I think most, if not all, parameters should go into link creation.
That way we get an identical anon_iter through either path:
    link -> anon_iter
    link -> pinned file -> anon_iter

I do not really expect any more fields for bpf_iter_create.
The flags argument is there for potential future extension,
though I have no idea yet what that would look like.
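
For reference, the options-struct alternative could look roughly like
the sketch below. It is an illustration only: demo_iter_create_opts,
DEMO_OPTS_HAS, DEMO_OPTS_GET and demo_resolve_flags are hypothetical
names, not part of this patch or of libbpf. The assumption is that the
caller records the struct size in sz, so a field appended later is read
only when the caller's struct is large enough to contain it.

```c
#include <stddef.h>

/* Hypothetical options struct; the first field records the caller's view
 * of the struct size so that new fields can be appended later. */
struct demo_iter_create_opts {
	size_t sz;
	unsigned int flags;
};

/* True if the caller's struct is large enough to contain 'field'. */
#define DEMO_OPTS_HAS(opts, field)					\
	((opts) && (opts)->sz >= offsetof(__typeof__(*(opts)), field) +	\
				 sizeof((opts)->field))

/* Read 'field' if present, otherwise use 'fallback'. */
#define DEMO_OPTS_GET(opts, field, fallback)				\
	(DEMO_OPTS_HAS(opts, field) ? (opts)->field : (fallback))

/* Resolve the flags that would be handed to the syscall wrapper.
 * A NULL opts pointer means "all defaults". */
static unsigned int
demo_resolve_flags(const struct demo_iter_create_opts *opts)
{
	return DEMO_OPTS_GET(opts, flags, 0);
}
```

With this shape, a caller built against an older, shorter struct still
works: any field it does not know about falls back to its default.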

> 
>> +{
>> +       union bpf_attr attr;
>> +
>> +       memset(&attr, 0, sizeof(attr));
>> +       attr.iter_create.link_fd = link_fd;
>> +       attr.iter_create.flags = flags;
>> +
>> +       return sys_bpf(BPF_ITER_CREATE, &attr, sizeof(attr));
>> +}
>> +
> 
> [...]
> 
>> +/*
>> + * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
>> + * in a structure. BPF_SEQ_PRINTF0 is a simple wrapper for
>> + * bpf_seq_printf().
>> + */
>> +#define BPF_SEQ_PRINTF0(seq, fmt)                                      \
>> +       ({                                                              \
>> +               int ret = bpf_seq_printf(seq, fmt, sizeof(fmt),         \
>> +                                        (void *)0, 0);                 \
>> +               ret;                                                    \
>> +       })
>> +
>> +#define BPF_SEQ_PRINTF(seq, fmt, args...)                              \
> 
> You can unify BPF_SEQ_PRINTF and BPF_SEQ_PRINTF0 by using
> ___bpf_empty() macro. See bpf_core_read.h for similar use case.
> Specifically, look at ___empty (equivalent of ___bpf_empty) and
> ___core_read, ___core_read0, ___core_readN macro.

Thanks for the tip. Will try.
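
A minimal sketch of that dispatch trick, assuming GNU C extensions
(named variadic macros, ", ##__VA_ARGS__" comma deletion, statement
expressions). All DEMO_* names and demo_seq_printf() are hypothetical
stand-ins so the example builds in user space; demo_seq_printf() only
returns the payload size instead of printing. This version handles at
most four arguments.

```c
/* Hypothetical stand-in for bpf_seq_printf(): returns the size of the
 * argument payload so the caller can see which variant was chosen. */
static int demo_seq_printf(const char *fmt, unsigned int fmt_size,
			   const void *data, unsigned int data_len)
{
	(void)fmt;
	(void)fmt_size;
	(void)data;
	return (int)data_len;
}

/* Picks the sixth argument; used to detect an empty argument list. */
#define DEMO_NTH(_, _1, _2, _3, _4, N, ...) N
/* Expands to 0 with no arguments and to N with one to four arguments. */
#define DEMO_EMPTY(...) DEMO_NTH(_, ##__VA_ARGS__, N, N, N, N, 0)
#define DEMO_CONCAT(a, b) a ## b
#define DEMO_APPLY(fn, n) DEMO_CONCAT(fn, n)

/* No arguments: pass a NULL payload of length 0. */
#define DEMO_SEQ_IMPL0(fmt, args...)					\
	demo_seq_printf(fmt, sizeof(fmt), (void *)0, 0)

/* One or more arguments: pack them into an array whose length the
 * compiler infers from the initializer. */
#define DEMO_SEQ_IMPLN(fmt, args...)					\
	({								\
		unsigned long long ___param[] = { args };		\
		demo_seq_printf(fmt, sizeof(fmt), ___param,		\
				sizeof(___param));			\
	})

/* Single entry point: dispatches to IMPL0 or IMPLN. */
#define DEMO_SEQ_PRINTF(fmt, args...)					\
	DEMO_APPLY(DEMO_SEQ_IMPL, DEMO_EMPTY(args))(fmt, args)
```

Here DEMO_SEQ_PRINTF("hello\n") takes the IMPL0 path, while
DEMO_SEQ_PRINTF("%d %d\n", 1, 2) takes the IMPLN path with a payload of
two unsigned long long slots.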

> 
>> +       ({                                                              \
>> +               _Pragma("GCC diagnostic push")                          \
>> +               _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")  \
>> +               __u64 param[___bpf_narg(args)] = { args };              \
> 
> Do you need to provide the size of array here? If you omit
> ___bpf_narg(args), wouldn't compiler automatically calculate the right
> size?
> 

Yes, the compiler should calculate the correct size.

> Also, can you please use "unsigned long long" to not have any implicit
> dependency on __u64 being defined?

Will do.

> 
>> +               _Pragma("GCC diagnostic pop")                           \
>> +               int ret = bpf_seq_printf(seq, fmt, sizeof(fmt),         \
>> +                                        param, sizeof(param));         \
>> +               ret;                                                    \
>> +       })
>> +
>>   #endif
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index 8e1dc6980fac..ffdc4d8e0cc0 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -6366,6 +6366,9 @@ static const struct bpf_sec_def section_defs[] = {
>>                  .is_attach_btf = true,
>>                  .expected_attach_type = BPF_LSM_MAC,
>>                  .attach_fn = attach_lsm),
>> +       SEC_DEF("iter/", TRACING,
>> +               .expected_attach_type = BPF_TRACE_ITER,
>> +               .is_attach_btf = true),
> 
> It would be nice to implement auto-attach capabilities (similar to
> fentry/fexit, lsm and raw_tracepoint). Section name should have enough
> information for this, no?

In the current form, yes, auto-attach is possible.
But I think we may soon have additional information,
such as a map_id (passed in link_create), that would
make auto-attach impossible. That is why
I implemented an explicit attach. Is this assessment correct?

> 
>>          BPF_PROG_SEC("xdp",                     BPF_PROG_TYPE_XDP),
>>          BPF_PROG_SEC("perf_event",              BPF_PROG_TYPE_PERF_EVENT),
>>          BPF_PROG_SEC("lwt_in",                  BPF_PROG_TYPE_LWT_IN),
>> @@ -6629,6 +6632,7 @@ static int bpf_object__collect_struct_ops_map_reloc(struct bpf_object *obj,
>>
> 
> [...]
> 
>> +
>> +       link = calloc(1, sizeof(*link));
>> +       if (!link)
>> +               return ERR_PTR(-ENOMEM);
>> +       link->detach = &bpf_link__detach_fd;
>> +
>> +       attach_type = bpf_program__get_expected_attach_type(prog);
> 
> Given you know it has to be BPF_TRACE_ITER, it's better to explicitly
> specify that. If provided program wasn't loaded with correct
> expected_attach_type, kernel will reject it. But if you don't do it,
> then you can accidentally create some other type of bpf_link.

Yes, will do.

> 
>> +       link_fd = bpf_link_create(prog_fd, 0, attach_type, NULL);
>> +       if (link_fd < 0) {
>> +               link_fd = -errno;
>> +               free(link);
>> +               pr_warn("program '%s': failed to attach to iterator: %s\n",
>> +                       bpf_program__title(prog, false),
>> +                       libbpf_strerror_r(link_fd, errmsg, sizeof(errmsg)));
>> +               return ERR_PTR(link_fd);
>> +       }
>> +       link->fd = link_fd;
>> +       return link;
>> +}
>> +
>> +int bpf_link__create_iter(struct bpf_link *link, unsigned int flags)
>> +{
> 
> Same question as for low-level bpf_iter_create(). If we expect the
> need to extend optional things in the future, I'd add opts right now.
> 
> But I wonder if bpf_link__create_iter() provides any additional value
> beyond bpf_iter_create(). Maybe let's not add it (yet)?

The only additional thing is a better warning message.
Agreed, that is marginal. Will drop it.

> 
>> +       char errmsg[STRERR_BUFSIZE];
>> +       int iter_fd;
>> +
>> +       iter_fd = bpf_iter_create(bpf_link__fd(link), flags);
>> +       if (iter_fd < 0) {
>> +               iter_fd = -errno;
>> +               pr_warn("failed to create an iterator: %s\n",
>> +                       libbpf_strerror_r(iter_fd, errmsg, sizeof(errmsg)));
>> +       }
>> +
>> +       return iter_fd;
>> +}
>> +
> 
> [...]
>