Message-ID: <87tv2e10ly.fsf@toke.dk>
Date:   Tue, 24 Mar 2020 11:57:45 +0100
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc:     John Fastabend <john.fastabend@...il.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        Andrii Nakryiko <andriin@...com>,
        "David S. Miller" <davem@...emloft.net>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Lorenz Bauer <lmb@...udflare.com>,
        Andrey Ignatov <rdna@...com>,
        Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH bpf-next 1/4] xdp: Support specifying expected existing program when attaching XDP

Andrii Nakryiko <andrii.nakryiko@...il.com> writes:

> On Mon, Mar 23, 2020 at 12:23 PM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
>>
>> > On Mon, Mar 23, 2020 at 4:24 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>> >>
>> >> Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
>> >>
>> >> > On Fri, Mar 20, 2020 at 11:31 AM John Fastabend
>> >> > <john.fastabend@...il.com> wrote:
>> >> >>
>> >> >> Jakub Kicinski wrote:
>> >> >> > On Fri, 20 Mar 2020 09:48:10 +0100 Toke Høiland-Jørgensen wrote:
>> >> >> > > Jakub Kicinski <kuba@...nel.org> writes:
>> >> >> > > > On Thu, 19 Mar 2020 14:13:13 +0100 Toke Høiland-Jørgensen wrote:
>> >> >> > > >> From: Toke Høiland-Jørgensen <toke@...hat.com>
>> >> >> > > >>
>> >> >> > > >> While it is currently possible for userspace to specify that an existing
>> >> >> > > >> XDP program should not be replaced when attaching to an interface, there is
>> >> >> > > >> no mechanism to safely replace a specific XDP program with another.
>> >> >> > > >>
>> >> >> > > >> This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_FD, which can be
>> >> >> > > >> set along with IFLA_XDP_FD. If set, the kernel will check that the program
>> >> >> > > >> currently loaded on the interface matches the expected one, and fail the
>> >> >> > > >> operation if it does not. This corresponds to a 'cmpxchg' memory operation.
>> >> >> > > >>
>> >> >> > > >> A new companion flag, XDP_FLAGS_EXPECT_FD, is also added to explicitly
>> >> >> > > >> request checking of the EXPECTED_FD attribute. This is needed for userspace
>> >> >> > > >> to discover whether the kernel supports the new attribute.
>> >> >> > > >>
>> >> >> > > >> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
>> >> >> > > >
>> >> >> > > > I didn't know we wanted to go ahead with this...
>> >> >> > >
>> >> >> > > Well, I'm aware of the bpf_link discussion, obviously. Not sure what's
>> >> >> > > happening with that, though. So since this is a straight-forward
>> >> >> > > extension of the existing API, that doesn't carry a high implementation
>> >> >> > > cost, I figured I'd just go ahead with this. Doesn't mean we can't have
>> >> >> > > something similar in bpf_link as well, of course.
>> >> >> >
>> >> >> > I'm not really in the loop, but from what I overheard - I think the
>> >> >> > bpf_link may be targeting something non-networking first.
>> >> >>
>> >> >> My preference is to avoid building two different APIs, one for XDP and another
>> >> >> for everything else. If we have userlands that already understand links and
>> >> >> pinning support is on the way, imo let's use these APIs for networking as well.
>> >> >
>> >> > I agree here. And yes, I've been working on extending bpf_link into
>> >> > cgroup and then to XDP. We are still discussing some cgroup-specific
>> >> > details, but the patch is ready. I'm going to post it as an RFC to get
>> >> > the discussion started, before we do this for XDP.
>> >>
>> >> Well, my reason for being skeptical about bpf_link and proposing the
>> >> netlink-based API is actually exactly this, but in reverse: With
>> >> bpf_link we will be in the situation that everything related to a netdev
>> >> is configured over netlink *except* XDP.
>> >
>> > One can argue that everything related to use of BPF is going to be
>> > uniform and done through BPF syscall? Given variety of possible BPF
>> > hooks/targets, using custom ways to attach for all those many cases is
>> > really bad as well, so having a unifying concept and single entry to
>> > do this is good, no?
>>
>> Well, it depends on how you view the BPF subsystem's relation to the
>> rest of the kernel, I suppose. I tend to view it as a subsystem that
>> provides a bunch of functionality, which you can set up (using "internal"
>> BPF APIs), and then attach that object to a different subsystem
>> (networking) using that subsystem's configuration APIs.
>>
>> Seeing as this really boils down to a matter of taste, though, I'm not
>> sure we'll find agreement on this :)
>
> Yeah, seems like so. But then again, your view and reality don't seem
> to correlate completely. cgroup, a lot of tracing,
> flow_dissector/lirc_mode2 attachments all are done through BPF
> syscall.

Well, I wasn't talking about any of those subsystems, I was talking
about networking :)

In particular, networking already has a consistent and fairly
well-designed configuration mechanism (i.e., netlink) that we are
generally trying to move more functionality *towards*, not *away from*
(see, e.g., converting ethtool to use netlink).

> LINK_CREATE provides an opportunity to finally unify all those
> different ways to achieve the same "attach my BPF program to some
> target object" semantics.

Well, I also happen to think that "attach a BPF program to an object" is
the wrong way to think about XDP. Rather, in my mind the model is
"instruct the netdevice to execute this piece of BPF code".

>> >> Other than that, I don't see any reason why the bpf_link API won't work.
>> >> So I guess that if no one else has any problem with BPF insisting on
>> >> being a special snowflake, I guess I can live with it as well... *shrugs* :)
>> >
>> > Apart from derogatory remark,
>>
>> Yeah, should have left out the 'snowflake' bit, sorry about that...
>>
>> > BPF is a bit special here, because it requires every potential BPF
>> > hook (be it cgroups, xdp, perf_event, etc) to be aware of BPF
>> > program(s) and execute them with special macro. So like it or not, it
>> > is special and each driver supporting BPF needs to implement this BPF
>> > wiring.
>>
>> All that is about internal implementation, though. I'm bothered by the
>> API discrepancy (i.e., from the user PoV we'll end up with: "netlink is
>> what you use to configure your netdev except if you want to attach an
>> XDP program to it").
>>
>
> See my reply to David. Depends on where you define user API. Is it
> libbpf API, which is what most users are using? Or kernel API?

Well, I'm talking about the kernel<->userspace API, obviously :)

> If everyone is using libbpf, does kernel system (bpf syscall vs
> netlink) matter all that much?

This argument works the other way as well, though: If libbpf can
abstract the subsystem differences and provide a consistent interface to
"the BPF world", why does BPF need to impose its own syscall API on the
networking subsystem?

-Toke
