Message-ID: <473a3e8a-03ea-636c-f054-3c960bf0fdbd@iogearbox.net>
Date: Thu, 5 Mar 2020 23:34:18 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Toke Høiland-Jørgensen <toke@...hat.com>
Cc: Alexei Starovoitov <ast@...com>,
Andrii Nakryiko <andrii.nakryiko@...il.com>,
Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 0/3] Introduce pinnable bpf_link kernel abstraction
On 3/5/20 5:34 PM, Alexei Starovoitov wrote:
> On Thu, Mar 05, 2020 at 11:37:11AM +0100, Toke Høiland-Jørgensen wrote:
>> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
>>> On Wed, Mar 04, 2020 at 08:47:44AM +0100, Toke Høiland-Jørgensen wrote:
[...]
>> Anyway, what I was trying to express:
>>
>>> Still that doesn't mean that pinned link is 'immutable'.
>>
>> I don't mean 'immutable' in the sense that it cannot be removed ever.
>> Just that we may end up in a situation where an application can see a
>> netdev with an XDP program attached, has the right privileges to modify
>> it, but can't because it can't find the pinned bpf_link. Right? Or am I
>> misunderstanding your proposal?
>>
>> Amending my example from before, this could happen by:
>>
>> 1. Someone attaches a program to eth0, and pins the bpf_link to
>> /sys/fs/bpf/myprog
>>
>> 2. eth0 is moved to a different namespace which mounts a new sysfs at
>> /sys
>>
>> 3. Inside that namespace, /sys/fs/bpf/myprog is no longer accessible, so
>> xdp-loader can't get access to the original bpf_link; but the XDP
>> program is still attached to eth0.
>
> The key to decide is whether moving netdev across netns should be allowed
> when xdp attached. I think it should be denied. Even when legacy xdp
> program is attached, since it will confuse user space managing part.
There are perfectly valid use cases where this is already done today (minus
bpf_link). For example, consider an orchestrator that sets up the BPF program
on the device and then moves the device into the newly created application pod
during the CNI call in k8s. The new pod does not have the /sys/fs/bpf/ mount
instance and, if unprivileged, cannot remove the BPF prog from the dev either.
We do something like this in the case of ipvlan: we attach a rootlet prog that
calls into the single slot of a tail call map, and move the device to the
application pod. Only from Cilium's own pod and its pod-local bpf fs instance
do we manage the pinned tail call map to update the main programs in that
single slot, w/o having to switch any netns later on.
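To make the indirection concrete, here is a rough userspace model of the pattern in plain C. It is only a sketch and not BPF code: the function pointer stands in for a prog array entry, the direct call stands in for a tail call, and every name in it (tail_call_map, rootlet, slot_update, the VERDICT_* values) is made up for illustration. The one property it relies on from the real mechanism is that a tail call into an empty slot falls through to the code after it.

```c
/* Userspace model only, NOT BPF code. All names here are hypothetical. */
#include <stddef.h>

/* Stand-ins for the verdicts a program attached to the device returns. */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2 };

typedef int (*prog_fn)(void *ctx);

/* Models the pinned tail call map: a prog array with a single slot. */
struct tail_call_map {
    prog_fn slot[1];
};

/*
 * Models the rootlet prog that stays attached to the device. It never
 * changes; it only jumps into whatever currently sits in slot 0.
 */
static int rootlet(const struct tail_call_map *map, void *ctx)
{
    prog_fn main_prog = map->slot[0];

    if (main_prog)
        return main_prog(ctx); /* a real tail call would not return here */

    return VERDICT_PASS;       /* empty slot: fall through */
}

/*
 * Models the manager side: it holds its own handle to the map (the pin
 * in its pod-local bpf fs) and swaps the main program without touching
 * the device or entering the netns the device lives in.
 */
static void slot_update(struct tail_call_map *map, prog_fn prog)
{
    map->slot[0] = prog;
}

/* Two stand-in "main programs" to swap between. */
static int main_prog_v1(void *ctx) { (void)ctx; return VERDICT_PASS; }
static int main_prog_v2(void *ctx) { (void)ctx; return VERDICT_DROP; }
```

In this model the pod only ever sees rootlet() on its device, while the behaviour is decided by whoever can reach the map: calling slot_update(&map, main_prog_v2) from the manager side changes what the next rootlet(&map, ctx) returns, with no re-attach on the device.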
Thanks,
Daniel