Message-ID: <20191003175848.GE3223377@mini-arch>
Date: Thu, 3 Oct 2019 10:58:48 -0700
From: Stanislav Fomichev <sdf@...ichev.me>
To: John Fastabend <john.fastabend@...il.com>
Cc: Andrii Nakryiko <andrii.nakryiko@...il.com>,
Stanislav Fomichev <sdf@...gle.com>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Petar Penkov <ppenkov@...gle.com>
Subject: Re: [PATCH bpf-next 1/2] bpf/flow_dissector: add mode to enforce
global BPF flow dissector
On 10/03, John Fastabend wrote:
> Andrii Nakryiko wrote:
> > On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@...ichev.me> wrote:
> > >
> > > On 10/02, Andrii Nakryiko wrote:
> > > > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@...ichev.me> wrote:
> > > > >
> > > > > On 10/02, Andrii Nakryiko wrote:
> > > > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@...gle.com> wrote:
> > > > > > >
> > > > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > > > there is already one attached to the root namespace.
> > > > > > > Users can still detach their BPF programs, but can't attach any
> > > > > > > new ones (-EPERM).
> > > >
> > > > I find this quite confusing for users, honestly. If there is no root
> > > > namespace dissector we'll successfully attach per-net ones and they
> > > > > will be working fine. Then some process will attach a root one and all
> > > > > the previously successfully working ones will suddenly "break", with
> > > > > users potentially not realizing why. I bet this will be a hair-pulling
> > > > > investigation for someone. Furthermore, if the root net dissector is
> > > > > already attached, all subsequent attachments will now start failing.
> > > The idea is that if sysadmin decides to use system-wide dissector it would
> > > be attached from the init scripts/systemd early in the boot process.
> > > > So the users in your example would always get EPERM/EBUSY/EEXIST.
> > > I don't really see a realistic use-case where root and non-root
> > > namespaces attach/detach flow dissector programs at non-boot
> > > time (or why non-root containers could have BPF dissector and root
> > > could have C dissector; multi-nic machine?).
> > >
> > > But I totally see your point about confusion. See below.
> > >
> > > > > I'm not sure what the better behavior here is, but maybe at least
> > > > forcibly detach already attached ones, so when someone goes and tries
> > > > to investigate, they will see that their BPF program is not attached
> > > > anymore. Printing dmesg warning would be hugely useful here as well.
> > > We can do for_each_net and detach non-root ones; that sounds
> > > feasible and may avoid the confusion (at least when you query
> > > non-root ns to see if the prog is still there, you get a valid
> > > indication that it's not).
> > >
> > > > Alternatively, if there is any per-net dissector attached, we might
> > > > disallow root net dissector to be installed. Sort of "too late to the
> > > > party" way, but at least not surprising to successfully installed
> > > > dissectors.
> > > We can do this as well.
> > >
> > > > Thoughts?
> > > Let me try to implement both of your suggestions and see which one makes
> > > > more sense. I'm leaning towards the latter (simple check to see if
> > > any non-root ns has the prog attached).
> > >
> > > I'll follow up with a v2 if all goes well.
> >
> > > Thanks! I don't have a strong opinion on either, see what makes most
> > sense from an actual user perspective.
>
>
> From my point of view the second option is better. The root namespace flow
> dissector attach should always happen first before any other namespaces are
> created. If any namespaces have already attached then just fail the root
> namespace.
>
> Otherwise if you detach existing dissectors from a container these were
> probably attached by the init container which might not be running anymore
> and I have no easy way to learn/find out about this without creating another
> container specifically to watch for this. If I'm relying on the dissector
> > for something now I can get seemingly random errors. So it's a bit ugly and I'll
> probably just tell users to always attach the root namespace first to avoid
> this headache. On the other side if the root namespace already has a
> flow dissector attached and my init container fails its attach cmd I
> can handle the error gracefully or even fail to launch the container with
> a nice error message and the administrator can figure something out.
> I'm always in favor of hard errors vs trying to guess what the right
> choice is for any particular setup.
>
> Also it seems to me just checking if anything is attached is going to make
> the code simpler vs trying to detach things in all namespaces.
Agreed, I was also leaning towards this option. Thanks!
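
To make the agreed-on policy concrete, here is a rough user-space sketch of
the attach rules as discussed in this thread. It is not the kernel code:
struct ns_model, model_attach() and model_detach() are hypothetical names,
index 0 merely stands in for init_net, and the -EEXIST return for the
"root arrives too late" case is an assumption (the thread only fixes -EPERM
for per-net attaches once a root program exists).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: slot 0 plays the role of the root (init_net) namespace. */
#define ROOT_NS 0

struct ns_model {
	bool attached;	/* true if a flow dissector program is attached here */
};

/* Returns 0 on success or a negative errno, following kernel convention. */
static int model_attach(struct ns_model *ns, size_t n, size_t target)
{
	size_t i;

	assert(target < n);

	if (ns[target].attached)
		return -EEXIST;

	if (target == ROOT_NS) {
		/* "Too late to the party": refuse the root attach if any
		 * non-root namespace already has a program.
		 * (-EEXIST here is an assumption, see above.)
		 */
		for (i = 1; i < n; i++)
			if (ns[i].attached)
				return -EEXIST;
	} else if (ns[ROOT_NS].attached) {
		/* Root program is global; per-net attach is a hard error. */
		return -EPERM;
	}

	ns[target].attached = true;
	return 0;
}

/* Detaching is always allowed for a namespace that has a program. */
static int model_detach(struct ns_model *ns, size_t n, size_t target)
{
	assert(target < n);

	if (!ns[target].attached)
		return -ENOENT;

	ns[target].attached = false;
	return 0;
}
```

With three modeled namespaces, attaching in namespace 1 first makes a later
root attach fail; once namespace 1 detaches, the root attach succeeds and any
further per-net attach gets -EPERM. Nothing is ever detached behind the
user's back, which is the point of preferring the hard error.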