netdev - Re: explicit maps. Was: [RFC PATCH bpf-next 6/8] libbpf: allow specifying map definitions using BTF

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190610141528.38c71524@cakuba.netronome.com>
Date:   Mon, 10 Jun 2019 14:15:28 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Alexei Starovoitov <ast@...com>
Cc:     Andrii Nakryiko <andrii.nakryiko@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Stanislav Fomichev <sdf@...ichev.me>,
        Andrii Nakryiko <andriin@...com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>, Yonghong Song <yhs@...com>
Subject: Re: explicit maps. Was: [RFC PATCH bpf-next 6/8] libbpf: allow
 specifying map definitions using BTF

On Mon, 10 Jun 2019 01:17:13 +0000, Alexei Starovoitov wrote:
> On 6/6/19 6:02 PM, Jakub Kicinski wrote:
> > On Fri, 7 Jun 2019 00:27:52 +0000, Alexei Starovoitov wrote:  
> >> the solution we're discussing should solve BPF_ANNOTATE_KV_PAIR too.
> >> That hack must go.  
> > 
> > I see.
> >   
> >> If I understood your objections to Andrii's format is that
> >> you don't like pointer part of key/value while Andrii explained
> >> why we picked the pointer, right?
> >>
> >> So how about:
> >>
> >> struct {
> >>     int type;
> >>     int max_entries;
> >>     struct {
> >>       __u32 key;
> >>       struct my_value value;
> >>     } types[];
> >> } ...  
> > 
> > My objection is that k/v fields are never initialized, so they're
> > "metafields", mixed with real fields which hold parameters - like
> > type, max_entries etc.  
> 
> I don't share this meta fields vs real fields distinction.
> All of the fields are meta.
> Kernel implementation of the map doesn't need to hold type and
> max_entries as actual configuration fields.
> The map definition in c++ would have looked like:
> bpf::hash_map<int, struct my_value, 1000, NO_PREALLOC> foo;
> bpf::array_map<struct my_value, 2000> bar;
> 
> Sometime key is not necessary. Sometimes flags have to be zero.
> bpf syscall api is a superset of all fiels for all maps.
> All of them are configuration and meta fields at the same time.
> In c++ example there is really no difference between
> 'struct my_value' and '1000' attributes.
> 
> I'm pretty sure bpf will have C++ front-end in the future,
> but until then we have to deal with C and, I think, the map
> definition should be the most natural C syntax.
> In that sense what you're proposing with extern:
> > extern struct my_key my_key;
> > extern int type_int;
> > 
> > struct map_def {
> >      int type;
> >      int max_entries;
> >      void *btf_key_ref;
> >      void *btf_val_ref;
> > } = {
> >      ...
> >      .btf_key_ref = &my_key,
> >      .btf_val_ref = &type_int,
> > };  
> 
> is worse than
> 
> struct map_def {
>        int type;
>        int max_entries;
>        int btf_key;
>        struct my_key btf_value;
> };
> 
> imo explicit key and value would be ideal,
> but they take too much space. Hence pointers
> or zero sized array:
> struct {
>       int type;
>       int max_entries;
>       struct {
>         __u32 key;
>         struct my_value value;
>       } types[];
> };

It is a C syntax problem, I do agree with you that it works well for
templates.  The map_def structure holds parameters, and we can't take
a type as a value in C.  Hence the types[] in your proposal - you could
as well call them ghost_fields[] :)

> I think we should also consider explicit map creation.
> 
> Something like:
> 
> struct my_map {
>    __u32 key;
>    struct my_value value;
> } *my_hash_map, *my_pinned_hash_map;
> 
> struct {
>     __u64 key;
>    struct my_map *value;
> } *my_hash_of_maps;
> 
> struct {
>    struct my_map *value;
> } *my_array_of_maps;
> 
> __init void create_my_maps(void)
> {
>    bpf_create_hash_map(&my_hash_map, 1000/*max_entries*/);
>    bpf_obj_get(&my_pinned_hash_map, "/sys/fs/bpf/my_map");
>    bpf_create_hash_of_maps(&my_hash_of_maps, 1000/*max_entries*/);
>    bpf_create_array_of_maps(&my_array_of_maps, 20);
> }
> 
> SEC("cgroup/skb")
> int bpf_prog(struct __sk_buff *skb)
> {
>    struct my_value *val;
>    __u32 key;
>    __u64 key64;
>    struct my_map *map;
> 
>    val = bpf_map_lookup(my_hash_map, &key);
>    map = bpf_map_lookup(my_hash_of_maps, &key64);
> }
> 
> '__init' section will be compiled by llvm into bpf instructions
> that will be executed in users space by libbpf.
> The __init prog has to succeed otherwise prog load fails.
> 
> May be all map pointers should be in a special section to avoid
> putting them into datasec, but libbpf should be able to figure that
> out without requiring user to specify the .map section.
> The rest of global vars would go into special datasec map.
> 
> No llvm changes necessary and BTF is available for keys and values.
> 
> libbpf can start with simple __init and eventually grow into
> complex init procedure where maps are initialized,
> prog_array is populated, etc.
> 
> Thoughts?

I like it! :)