[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87a744peij.fsf@toke.dk>
Date: Wed, 25 Mar 2020 17:48:04 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Andrii Nakryiko <andriin@...com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Jesper Dangaard Brouer <brouer@...hat.com>,
John Fastabend <john.fastabend@...il.com>,
Lorenz Bauer <lmb@...udflare.com>, Andrey Ignatov <rdna@...com>
Subject: Re: [PATCH bpf-next v3 1/4] xdp: Support specifying expected existing program when attaching XDP
Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> On Tue, Mar 24, 2020 at 07:12:53PM +0100, Toke Høiland-Jørgensen wrote:
>> From: Toke Høiland-Jørgensen <toke@...hat.com>
>>
>> While it is currently possible for userspace to specify that an existing
>> XDP program should not be replaced when attaching to an interface, there is
>> no mechanism to safely replace a specific XDP program with another.
>>
>> This patch adds a new netlink attribute, IFLA_XDP_EXPECTED_ID, which can be
>> set along with IFLA_XDP_FD. If set, the kernel will check that the program
>> currently loaded on the interface matches the expected one, and fail the
>> operation if it does not. This corresponds to a 'cmpxchg' memory operation.
>> Setting the new attribute with a negative value means that no program is
>> expected to be attached, which corresponds to setting the UPDATE_IF_NOEXIST
>> flag.
>>
>> A new companion flag, XDP_FLAGS_EXPECT_ID, is also added to explicitly
>> request checking of the EXPECTED_ID attribute. This is needed for userspace
>> to discover whether the kernel supports the new attribute.
>>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@...hat.com>
>
> Over the years of defining apis to attach bpf progs to different kernel
> components the key mistake we made is that we tried to use corresponding
> subsystem way of doing thing and it failed every time. What made the problem
> worse that this failure we only learned after many years. Attaching xdp via
> netlink felt great back then. Attaching clsbpf via tc felt awesome too. Doing
> cgroup-bpf via bpf syscall with three different attaching ways also felt great.
> All of these attaching things turned out to be broken because all of them
> failed to provide the concept of ownership of the attachment. Which caused 10k
> clsbpf progs on one netdev, 64 cgroup-bpf progs in one cgroup, nuked xdp progs.
> Hence we have to add the ownership model first. Doing mini extensions to
> existing apis is not addressing the giant issue of existing apis.
>
> The idea of this patch is to do atomic replacement of xdp prog. It's a good
> idea and useful feature, but I don't want to extend existing broken apis to do
> add this feature. atomic replacement has to come with thought through owner
> model first.
Setting aside the question of which is the best abstraction to represent
an attachment, it seems to me that the actual behavioural problem (XDP
programs being overridden by mistake) would be solvable by this patch,
assuming well-behaved userspace applications.
If you do need kernel support, we could extend the netlink API to accept
bpf_link FDs in addition to prog FDs. Then applications that fit the
bpf_link model could use netlink to setup the initial link (and any
other device configuration they need to do anyway), and any subsequent
replacements would be done by LINK_UPDATE bpf() operations. The lack of
CAP_NET_ADMIN could even restrict applications from removing the
bpf_link attachment from the netdev, without preventing them from
swapping out the bpf_prog attached to it.
This would also keep netlink solely in charge of netdev configuration,
and prevent any issues with a netdev being locked due to a bpf_link
attachment.
-Toke
Powered by blists - more mailing lists