Message-ID: <20231009182753.851551-1-toke@redhat.com>
Date: Mon, 9 Oct 2023 20:27:48 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: David Ahern <dsahern@...il.com>,
Stephen Hemminger <stephen@...workplumber.org>
Cc: netdev@...r.kernel.org,
Toke Høiland-Jørgensen <toke@...hat.com>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>,
Christian Brauner <brauner@...nel.org>,
"Eric W . Biederman" <ebiederm@...ssion.com>,
David Laight <David.Laight@...LAB.COM>
Subject: [RFC PATCH iproute2-next 0/5] Persisting of mount namespaces along with network namespaces
The 'ip netns' command is used for setting up network namespaces with persistent
named references, and is integrated into various other commands of iproute2 via
the -n switch.
This is useful both for testing setups and for simple script-based namespacing,
but it has one drawback: the lack of persistent mounts inside the spawned
namespace. This is particularly apparent when working with BPF programs that use
pinning to bpffs: by default no bpffs is available inside a namespace, and
even if one is mounted, it disappears as soon as the calling command exits.
The underlying cause for this is that iproute2 will create a new mount namespace
every time it switches into a network namespace. This is needed to be able to
mount a /sys filesystem that shows the correct network device information, but
has the unfortunate side effect of making mounts entirely transient for any 'ip
netns' invocation.
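
For readers less familiar with the mechanism, the sequence described above can
be sketched roughly as follows. This is a simplified illustration, not the
actual iproute2 code: the function name switch_to_netns() and the NETNS_DIR
value are assumptions made for the example, and error reporting is reduced to a
return code. The calls need CAP_SYS_ADMIN to succeed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for setns() and unshare() */
#endif
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

/* Assumed location of the named network namespace references. */
#define NETNS_DIR "/run/netns"

/*
 * Hypothetical helper: switch into the named netns and give the caller a
 * /sys that matches it. Returns 0 on success, -1 on any failure.
 */
int switch_to_netns(const char *name)
{
	char path[4096];
	int fd, n;

	n = snprintf(path, sizeof(path), "%s/%s", NETNS_DIR, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Join the persistent network namespace. */
	if (setns(fd, CLONE_NEWNET) < 0) {
		close(fd);
		return -1;
	}
	close(fd);

	/*
	 * A brand-new mount namespace on every invocation. Nothing holds a
	 * reference to it once the process exits, so every mount made inside
	 * it is transient.
	 */
	if (unshare(CLONE_NEWNS) < 0)
		return -1;

	/* Stop our mount changes from propagating back to the parent. */
	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL) < 0)
		return -1;

	/* Replace /sys so it reflects the devices of the new netns. */
	if (umount2("/sys", MNT_DETACH) < 0)
		return -1;
	if (mount(name, "/sys", "sysfs", 0, NULL) < 0)
		return -1;

	return 0;
}
```

The unshare(CLONE_NEWNS) step is the one that matters here: it is what allows
the private /sys, and also what discards any bpffs mounted by a previous
command.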
This series is an attempt to fix this situation, by persisting a mount namespace
alongside the persistent network namespace (in a separate directory,
/run/netns-mnt). Doing this allows us to still have a consistent /sys inside
the namespace, but with persistence so any mounts survive.
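
As a rough sketch of the lookup side of this idea (not the code from the
patches), a command could try the persisted mount namespace first and only fall
back to creating a fresh one when no reference exists. The helper names
build_ns_path() and enter_persistent_mntns() are hypothetical; only the
/run/netns-mnt directory comes from this series. How the reference is created
in the first place is left out here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for setns() */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Directory holding the persisted mount namespace references. */
#define NETNS_MNT_DIR "/run/netns-mnt"

/* Hypothetical helper: write "<dir>/<name>" into buf; -1 if it does not fit. */
int build_ns_path(char *buf, size_t len, const char *dir, const char *name)
{
	int n = snprintf(buf, len, "%s/%s", dir, name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: join the persisted mount namespace for 'name'.
 * Returns 1 if it was entered, 0 if no persisted reference exists (the
 * caller would then fall back to a fresh, transient mount namespace),
 * and -1 on any other error. Entering needs CAP_SYS_ADMIN.
 */
int enter_persistent_mntns(const char *name)
{
	char path[4096];
	int fd;

	if (build_ns_path(path, sizeof(path), NETNS_MNT_DIR, name) < 0)
		return -1;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return errno == ENOENT ? 0 : -1;

	/* Mounts made after this call live in the persisted namespace. */
	if (setns(fd, CLONE_NEWNS) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	return 1;
}
```

Treating a missing reference as a distinct, non-fatal result is what would let
namespaces created before such a change keep working with the old transient
behaviour.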
This mode does come with some caveats. I'm sending this as RFC to get feedback
on whether this is the right thing to do, especially considering backwards
compatibility. On balance, I think that the approach taken here of
unconditionally persisting the mount namespace, and using that persistent
reference whenever it exists, is better than the current behaviour, and that
while it does represent a change in behaviour it is backwards compatible in a
way that won't cause issues. But please do comment on this; see the patch
description of patch 4 for details.
The first three patches are just moving code around and should not represent any
functional changes. The fourth patch introduces the mount namespace persistence,
and the fifth patch adds mounting of a bpffs instance to the mount namespace
preparation logic.
Cc: Nicolas Dichtel <nicolas.dichtel@...nd.com>
Cc: Christian Brauner <brauner@...nel.org>
Cc: Eric W. Biederman <ebiederm@...ssion.com>
Cc: David Laight <David.Laight@...LAB.COM>
Toke Høiland-Jørgensen (5):
ip: Mount netns in child process instead of from inside the new
namespace
ip: Split out code creating namespace mount dir so it can be reused
lib/namespace: Factor out code for reuse
ip: Also create and persist mount namespace when creating netns
lib/namespace: Also mount a bpffs instance inside new mount namespaces
Makefile | 2 +
include/namespace.h | 1 +
ip/ipnetns.c | 357 ++++++++++++++++++++++++++++++--------------
lib/namespace.c | 82 +++++++---
4 files changed, 312 insertions(+), 130 deletions(-)
--
2.42.0