Message-ID: <aeae7b94-090a-a850-4740-0274ab8178d5@solarflare.com>
Date: Tue, 22 Oct 2019 18:27:26 +0100
From: Edward Cree <ecree@...arflare.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
"John Fastabend" <john.fastabend@...il.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <ast@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Marek Majkowski <marek@...udflare.com>,
Lorenz Bauer <lmb@...udflare.com>,
Alan Maguire <alan.maguire@...cle.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
"David Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
<bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF
programs after each other
On 17/10/2019 13:11, Toke Høiland-Jørgensen wrote:
> I think there's a conceptual disconnect here in how we view what an XDP
> program is. In my mind, an XDP program is a stand-alone entity tied to a
> particular application; not a library function that can just be inserted
> into another program.
To me, an XDP (or any other eBPF) program is a function that is already
being 'inserted into another program', namely, the kernel. It's a
function that's being wired up to a hook in the kernel. Which isn't
so different to wiring it up to a hook in a function that's wired up to
a hook in the kernel (which is what my proposal effectively does).
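A rough illustration of "a hook in a function that's wired up to a hook", written as ordinary userspace C rather than real eBPF. Everything here is hypothetical: the names, the verdict values, and the stop-at-first-non-PASS policy are assumptions made for the sketch, not something the proposal specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Verdicts loosely modelled on XDP return codes; values are illustrative. */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2 };

struct packet {
    const unsigned char *data;
    size_t len;
};

/* An app prog is just a function from packet to verdict. */
typedef enum verdict (*app_prog_fn)(const struct packet *pkt);

/*
 * Hypothetical master function: the only thing the kernel hook sees.
 * It runs each app prog in order and stops at the first verdict that
 * is not PASS (an assumed policy, chosen for the sketch).
 */
static enum verdict master_prog(const app_prog_fn *progs, size_t n,
                                const struct packet *pkt)
{
    assert(progs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        enum verdict v = progs[i](pkt);
        if (v != VERDICT_PASS)
            return v;
    }
    return VERDICT_PASS;
}

/* Two toy app progs to wire into the master. */
static enum verdict drop_empty(const struct packet *pkt)
{
    return pkt->len == 0 ? VERDICT_DROP : VERDICT_PASS;
}

static enum verdict pass_all(const struct packet *pkt)
{
    (void)pkt;
    return VERDICT_PASS;
}
```

From the kernel's side there is still exactly one function on the hook; which app progs it calls, and in what order, is the business of whoever built the master.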
> Setting aside that for a moment; the reason I don't think this belongs
> in userspace is that putting it there would carry a complexity cost that
> is higher than having it in the kernel.
Complexity in the kernel is more expensive than in userland. There are
several reasons for this, such as:
* The kernel's reliability requirements are stricter — a daemon that
crashes can be restarted, a kernel that crashes ruins your day.
* Userland has libraries available for many common tasks that can't be
used in the kernel.
* Anything ABI-visible (which this would be) has to be kept forever even
if it turns out to be a Bad Idea™, because We Do Not Break Userspace™.
The last of these is the big one, and means that wherever possible the
proper course is to prototype functionality in userspace, and then once
the ABI is solid and known-useful, it can move to the kernel if there's
an advantage to doing so (typically performance). Yes, that means
applications may have to change twice (though hopefully just a matter
of building against a new libbpf), but the old applications can be kept
working (by keeping the daemon around on such systems).
> Specifically, if we do implement
> an 'xdpd' daemon to handle all this, that would mean that we:
>
> - Introduce a new, separate code base that we'll have to write, support
> and manage updates to.
Separation is a good thing. Whichever way we do this, we have to write
some new code. Having that code _outside_ the kernel tree helps to keep
our layers separate. Chain calling is a layering violation!
> - Add a new dependency to using XDP (now you not only need the kernel
> and libraries, you'll also need the daemon).
You'll need *a* daemon. You won't be tied to a specific implementation.
And if you're just developing, you won't even need that (you can still
bind a prog directly to the device if you have the privileges), so it's
only for application deployment that it's needed. By the time you're
at the point of deploying an application that people are going to be
installing with "yum install myFirewall", you have the whole package
manager dependency resolution system to deal with the daemon.
> - Have to duplicate or wrap functionality currently found in the kernel;
> at least:
>
> - Keeping track of which XDP programs are loaded and attached to
> each interface
There's already an API to query this. You would probably want an atomic
cmpxchg operation, so that you can detect if someone else is fiddling
with XDP and scream noisy warnings.
> (as well as the "new state" of their attachment order).
That won't be duplicate, because it won't be in the kernel. The kernel
will only ever see one blob and it doesn't know or care how userland
assembled it.
> - Some kind of interface with the verifier; if an app does
> xdpd_rpc_load(prog), how is the verifier result going to get back
> to the caller?
The daemon will get the verifier log back when it tries to update the
program; it might want to do a bit of translation before passing it on,
but an RPC call can definitely return errors to the caller.
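As a sketch of what "return errors to the caller" could mean in practice, the daemon's reply might simply carry an error code plus the (possibly truncated) verifier log. The struct, the function and the buffer size below are all hypothetical; no such xdpd interface exists.

```c
#include <assert.h>
#include <stdio.h>

#define XDPD_LOG_MAX 256 /* arbitrary size chosen for the sketch */

/* Hypothetical reply to a load request. */
struct xdpd_reply {
    int err;                /* 0 on success, negative errno-style otherwise */
    char log[XDPD_LOG_MAX]; /* verifier log, NUL-terminated, may be truncated */
};

/*
 * Fill in a reply from whatever the daemon got back when it tried to
 * update the program. A NULL log is treated as an empty one.
 */
static void xdpd_fill_reply(struct xdpd_reply *reply, int err,
                            const char *verifier_log)
{
    assert(reply != NULL);
    reply->err = err;
    snprintf(reply->log, sizeof(reply->log), "%s",
             verifier_log ? verifier_log : "");
}
```

Any translation of the log (e.g. mapping instruction offsets back to the app's own prog) would happen before this step.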
In the Ideal World of kernel dynamic linking, of course, each app prog
gets submitted to the verifier by the app to create a floating function
in the kernel that's not bound to any XDP hook (app gets its verifier
responses at this point) and then the app just sends an fd for that
function to the daemon; at that point any verifier errors after linking
are the fault of the daemon and its master program. Thus the Ideal
World doesn't need any kind of translation of verifier output to make
it match up with each individual app's program.
> - Have to deal with state synchronisation issues (how does xdpd handle
> kernel state changing from underneath it?).
The cmpxchg I mentioned above would help with that.
> While these are issues that are (probably) all solvable, I think the
> cost of solving them is far higher than putting the support into the
> kernel. Which is why I think kernel support is the best solution :)
See my remarks above about kernel ABIs.
Also, chain calling and the synchronisation dance between apps still
looks needlessly complex and fragile to me — it's like you're having
the kernel there to be the central point of control and then not
actually having a central point of control after all. (But if chain
calling does turn out to be the right API, well, the daemon can
always implement that!)
-Ed