netdev - Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other

Message-ID: <aeae7b94-090a-a850-4740-0274ab8178d5@solarflare.com>
Date:   Tue, 22 Oct 2019 18:27:26 +0100
From:   Edward Cree <ecree@...arflare.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>,
        "John Fastabend" <john.fastabend@...il.com>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>
CC:     Daniel Borkmann <daniel@...earbox.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Marek Majkowski <marek@...udflare.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Alan Maguire <alan.maguire@...cle.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        "David Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
        <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF
 programs after each other

On 17/10/2019 13:11, Toke Høiland-Jørgensen wrote:
> I think there's a conceptual disconnect here in how we view what an XDP
> program is. In my mind, an XDP program is a stand-alone entity tied to a
> particular application; not a library function that can just be inserted
> into another program.
To me, an XDP (or any other eBPF) program is a function that is already
 being 'inserted into another program', namely, the kernel.  It's a
 function that's being wired up to a hook in the kernel.  Which isn't
 so different to wiring it up to a hook in a function that's wired up to
 a hook in the kernel (which is what my proposal effectively does).
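
As a rough illustration of that "hook within a hook" picture, here is a toy
model in Python. Plain callables stand in for eBPF programs; make_dispatcher,
firewall and counter_factory are hypothetical names, and the "stop at the first
non-PASS verdict" policy is an assumption made for the sketch, not something
the proposal fixes.

```python
# Toy model only: plain Python callables stand in for eBPF programs.
XDP_DROP, XDP_PASS = 1, 2  # values match the kernel's enum xdp_action


def make_dispatcher(progs):
    """Return one 'master program' that runs each app program in order.

    Assumed policy (hypothetical): stop at the first verdict that is
    not XDP_PASS and return it.
    """
    def dispatcher(pkt):
        for prog in progs:
            verdict = prog(pkt)
            if verdict != XDP_PASS:
                return verdict
        return XDP_PASS
    return dispatcher


def firewall(pkt):
    # Hypothetical app program: drop telnet, pass everything else.
    return XDP_DROP if pkt.get("dport") == 23 else XDP_PASS


def counter_factory():
    # Hypothetical app program that records every packet it is shown.
    seen = []

    def count(pkt):
        seen.append(pkt)
        return XDP_PASS
    return count, seen


count_prog, seen = counter_factory()
# The "kernel hook" holds exactly one function, however it was assembled.
kernel_hook = make_dispatcher([firewall, count_prog])
```

From the hook's point of view there is still a single function attached; the
ordering of the app programs lives entirely in whatever built the dispatcher.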

> Setting aside that for a moment; the reason I don't think this belongs
> in userspace is that putting it there would carry a complexity cost that
> is higher than having it in the kernel.
Complexity in the kernel is more expensive than in userland.  There are
 several reasons for this, such as:
* The kernel's reliability requirements are stricter — a daemon that
  crashes can be restarted, a kernel that crashes ruins your day.
* Userland has libraries available for many common tasks that can't be
  used in the kernel.
* Anything ABI-visible (which this would be) has to be kept forever even
  if it turns out to be a Bad Idea™, because We Do Not Break Userspace™.
The last of these is the big one, and means that wherever possible the
 proper course is to prototype functionality in userspace, and then once
 the ABI is solid and known-useful, it can move to the kernel if there's
 an advantage to doing so (typically performance).  Yes, that means
 applications may have to change twice (though hopefully just a matter
 of building against a new libbpf), but the old applications can be kept
 working (by keeping the daemon around on such systems).

> Specifically, if we do implement
> an 'xdpd' daemon to handle all this, that would mean that we:
>
> - Introduce a new, separate code base that we'll have to write, support
>   and manage updates to.
Separation is a good thing.  Whichever way we do this, we have to write
 some new code.  Having that code _outside_ the kernel tree helps to keep
 our layers separate.  Chain calling is a layering violation!

> - Add a new dependency to using XDP (now you not only need the kernel
>   and libraries, you'll also need the daemon).
You'll need *a* daemon.  You won't be tied to a specific implementation.
And if you're just developing, you won't even need that — you can still
 bind a prog directly to the device if you have the ackles — so it's
 only for application deployment that it's needed.  By the time you're
 at the point of deploying an application that people are going to be
 installing with "yum install myFirewall", you have the whole package
 manager dependency resolution system to deal with the daemon.

> - Have to duplicate or wrap functionality currently found in the kernel;
>   at least:
>   
>     - Keeping track of which XDP programs are loaded and attached to
>       each interface
There's already an API to query this.  You would probably want an atomic
 cmpxchg operation, so that you can detect if someone else is fiddling
 with XDP and scream noisy warnings.
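
A minimal sketch of what such a compare-and-exchange attach could look like
from the daemon's side, modelled in Python. Interface, cmpxchg_prog and
daemon_update are hypothetical names invented for this sketch; they are not an
existing kernel or libbpf interface, and the lock only stands in for whatever
atomicity the kernel would provide.

```python
import threading


class Interface:
    """Stand-in for a netdev's XDP attachment slot (hypothetical model)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.prog_id = None  # None means nothing is attached

    def cmpxchg_prog(self, expected, new):
        """Attach `new` only if `expected` is still attached.

        Always returns the program id that was found, so the caller can
        tell whether the swap happened.
        """
        with self._lock:
            found = self.prog_id
            if found == expected:
                self.prog_id = new
            return found


def daemon_update(iface, last_seen, new, warn=print):
    """Try to swap in `new`; warn loudly if someone else changed the slot."""
    found = iface.cmpxchg_prog(last_seen, new)
    if found != last_seen:
        warn(f"XDP prog changed behind our back: "
             f"expected {last_seen}, found {found}")
        return False
    return True
```

If another tool attaches its own program between two daemon updates, the second
update fails without overwriting anything and the daemon gets a chance to
complain.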

> (as well as the "new state" of their attachment order).
That won't be duplicate, because it won't be in the kernel.  The kernel
 will only ever see one blob and it doesn't know or care how userland
 assembled it.

>     - Some kind of interface with the verifier; if an app does
>       xdpd_rpc_load(prog), how is the verifier result going to get back
>       to the caller?
The daemon will get the verifier log back when it tries to update the
 program; it might want to do a bit of translation before passing it on,
 but an RPC call can definitely return errors to the caller.
In the Ideal World of kernel dynamic linking, of course, each app prog
 gets submitted to the verifier by the app to create a floating function
 in the kernel that's not bound to any XDP hook (app gets its verifier
 responses at this point) and then the app just sends an fd for that
 function to the daemon; at that point any verifier errors after linking
 are the fault of the daemon and its master program.  Thus the Ideal
 World doesn't need any kind of translation of verifier output to make
 it match up with individual app's program.
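
To make the non-ideal case concrete (daemon rebuilds the blob and relays the
verifier result), here is a small Python sketch. Everything in it is a
stand-in: fake_verifier is not the kernel verifier, and Xdpd.rpc_load only
mimics the shape of the xdpd_rpc_load() call from the quoted text; the reply
format is an assumption.

```python
class VerifierError(Exception):
    """Carries the (fake) verifier log for a rejected blob."""

    def __init__(self, log):
        super().__init__(log)
        self.log = log


def fake_verifier(blob):
    """Hypothetical stand-in for the verifier: reject any part marked bad."""
    for name, ok in blob:
        if not ok:
            raise VerifierError(f"{name}: rejected")


class Xdpd:
    """Toy daemon that owns the single blob the kernel would see."""

    def __init__(self):
        self.parts = []  # (name, ok) pairs in the currently installed blob

    def rpc_load(self, name, ok):
        """Add one app program; return the verifier outcome to the caller."""
        candidate = self.parts + [(name, ok)]
        try:
            fake_verifier(candidate)
        except VerifierError as err:
            # The old blob stays installed; the caller gets the log back.
            return {"ok": False, "log": err.log}
        self.parts = candidate
        return {"ok": True, "log": ""}
```

The point is only that a failed update leaves the installed program untouched
and the error travels back over the same RPC that requested the load.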

> - Have to deal with state synchronisation issues (how does xdpd handle
>   kernel state changing from underneath it?).
The cmpxchg I mentioned above would help with that.

> While these are issues that are (probably) all solvable, I think the
> cost of solving them is far higher than putting the support into the
> kernel. Which is why I think kernel support is the best solution :)
See my remarks above about kernel ABIs.
Also, chain calling and the synchronisation dance between apps still
 looks needlessly complex and fragile to me — it's like you're having
 the kernel there to be the central point of control and then not
 actually having a central point of control after all.  (But if chain
 calling does turn out to be the right API, well, the daemon can
 always implement that!)

-Ed