netdev - Re: [PATCH v2 bpf-next 00/11] BTF-defined BPF map definitions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzYtM-41Y5w0-6kWhaeQUh7it5Zd_OZBj4kvMYou3TwQ+A@mail.gmail.com>
Date:   Fri, 21 Jun 2019 10:56:47 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Lorenz Bauer <lmb@...udflare.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andriin@...com>,
        Alexei Starovoitov <ast@...com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Kernel Team <kernel-team@...com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Joe Stringer <joe@...d.net.nz>
Subject: Re: [PATCH v2 bpf-next 00/11] BTF-defined BPF map definitions

On Fri, Jun 21, 2019 at 3:29 AM Lorenz Bauer <lmb@...udflare.com> wrote:
>
> On Fri, 21 Jun 2019 at 05:20, Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> >
> > On Thu, Jun 20, 2019 at 7:49 AM Lorenz Bauer <lmb@...udflare.com> wrote:
> > >
> > > On Tue, 18 Jun 2019 at 22:37, Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> > >
> > > > > I would just drop the object-scope pinning. We avoided using it and I'm not
> > > > > aware if anyone else make use. It also has the ugly side-effect that this
> > > > > relies on AF_ALG which e.g. on some cloud provider shipped kernels is disabled.
> > > > > The pinning attribute should be part of the standard set of map attributes for
> > > > > libbpf though as it's generally useful for networking applications.
> > > >
> > > > Sounds good. I'll do some more surveying of use cases inside FB to see
> > > > if anyone needs object-scope pinning, just to be sure we are not
> > > > short-cutting anyone.
> > >
> > > I'm also curious what the use cases for declarative pinning are. From my
> > > limited POV it doesn't seem that useful? There are a couple of factors:
> >
> > Cilium is using it pretty extensively, so there are clearly use cases.
> > The most straigtforward use case is using a map created and shared by
> > another BPF program (to communicate, read stats, what have you).
>
> I think Cilium is in the quirky position that it has a persistent daemon, but
> shells out to tc for loading programs. They are probably also the most
> advanced (open-source) users of BPF out there. If I understood their comments
> correctly they want to move to using a library for loading their ELF. At that
> point whether something is possible in a declarative way is less important,
> because you have the much more powerful APIs at your disposal.
>
> Maybe Daniel or someone else from the Cilium team can chime in here?

Yep, curious about their perspective on that.

>
> > > * Systemd mounts the default location only accessible to root, so I have to
> > >   used my own bpffs mount.
> > > * Since I don't want to hard code that, I put it in a config file.
> > > * After loading the ELF we pin maps from the daemon managing the XDP.
> >
> > So mounting root would be specified per bpf_object, before maps are
> > created, so user-land driving application will have an opportunity to
> > tune everything. Declarative is only the per-map decision of whether
> > that map should be exposed to outer world (for sharing) or not.
>
> So `tc filter add bpf obj foo.elf pin-root /gobbledygook`?

I meant something like:

bpf_object_open_attr attr;
attr.file = "path/to/my/object.o";
attr.pin_root_path = "/my/fancy/bpffs/root";
bpf_object__open_xattr(&attr);

Then tools can adopt they when necessary.

>
> > Then check tools/testing/selftests/bpf/progs/btf_dump_test_case_syntax.c
> > for more crazy syntax ;)
> >
> > typedef char * (* const (* const fn_ptr_arr2_t[5])())(char * (*)(int));
>
> Not on a Friday ;P
>
> > > What if this did
> > >
> > >   __type(value, struct my_value)[1000];
> > >   struct my_value __member(value)[1000]; // alternative
> > >
> > > instead, and skipped max_entries?
> >
> > I considered that, but decided for now to keep all those attributes
> > orthogonal for more flexibility and uniformity. This syntax might be
> > considered a nice "syntax sugar" and can be added in the future, if
> > necessary.
>
> Ack.
>
> > > At that point you have to understand that value is a pointer so all of
> > > our efforts
> > > are for naught. I suspect there is other weirdness like this, but I need to play
> > > with it a little bit more.
> >
> > Yes, C can let you do crazy stuff, if you wish, but I think that
> > shouldn't be a blocker for this proposal. I haven't seen any BPF
> > program doing that, usually you duplicate the type of inner value
> > inside your function anyway, so there is no point in taking
> > sizeof(map.value) from BPF program side. From outside, though, all the
> > types will make sense, as expected.
>
> Right, but in my mind that is a bit of a cop out. I like BTF map definitions,
> and I want them to be as unsurprising as possible, so that they are
> easy to use and adopt.


Right, but there are limit on what you can do with C syntax and it's
type system. Having fancy extra features like you described (e.g,
sizeof(map.value), etc) is pretty low on a priority list.

>
> If a type encodes all the information we need via the array dimension hack,
> couldn't we make the map variable itself a pointer, and drop the inner pointers?
>
> struct my_map_def {
>   int type[BPF_MAP_TYPE_HASH];
>   int value;
>   struct foo key;

This is bad because it potentially uses lots of space. If `struct foo`
is big, if max_entries is big, even for type, it's still a bunch of
extra space wasted. That's why we have pointers everywhere, as they
allow to encode everything with fixed space overhead of 8 bytes for a
pointer.


>   ...
> }
>
> struct my_map_def *my_map;
>
> --
> Lorenz Bauer  |  Systems Engineer
> 6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
>
> www.cloudflare.com