Message-ID: <877cnvtu37.fsf@email.froward.int.ebiederm.org>
Date: Mon, 09 Oct 2023 15:32:44 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>
Cc: David Ahern <dsahern@...il.com>, Stephen Hemminger
<stephen@...workplumber.org>, netdev@...r.kernel.org, Nicolas Dichtel
<nicolas.dichtel@...nd.com>, Christian Brauner <brauner@...nel.org>,
David Laight <David.Laight@...LAB.COM>
Subject: Re: [RFC PATCH iproute2-next 0/5] Persisting of mount namespaces
along with network namespaces
Toke Høiland-Jørgensen <toke@...hat.com> writes:
> The 'ip netns' command is used for setting up network namespaces with persistent
> named references, and is integrated into various other commands of iproute2 via
> the -n switch.
>
> This is useful both for testing setups and for simple script-based namespacing
> but has one drawback: the lack of persistent mounts inside the spawned
> namespace. This is particularly apparent when working with BPF programs that use
> pinning to bpffs: by default no bpffs is available inside a namespace, and
> even if mounting one, that fs disappears as soon as the calling
> command exits.
It would be entirely reasonable to copy mounts like /sys/fs/bpf from the
original mount namespace into the temporary mount namespace used by
"ip netns".
I would call it a bug that "ip netns" doesn't do that already.
I suspect that the fact that "ip netns" does not copy the mounts from the
old sysfs onto the new sysfs is your entire problem.
Or is there a reason that bpffs should be per network namespace?
> The underlying cause for this is that iproute2 will create a new mount namespace
> every time it switches into a network namespace. This is needed to be able to
> mount a /sys filesystem that shows the correct network device information, but
> has the unfortunate side effect of making mounts entirely transient for any 'ip
> netns' invocation.
Mount propagation can be made to work if necessary; that would solve the
transient problem.
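As a rough illustration of what "propagation" means in practice, the sketch below reads the optional propagation tags (such as shared:N or master:N) for a mount point from text in the /proc/self/mountinfo format. The function name propagation_of is made up for this example, and it only inspects state; it does not change how any mount propagates.

```python
def propagation_of(mountinfo_text, mount_point):
    """Return the propagation tags for mount_point, or None if not mounted.

    An empty list means the mount carries no tags (a private mount).
    Hypothetical helper; escaped characters in paths are not decoded here.
    """
    result = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fixed fields 0..5, optional tags, a lone "-", then fstype/source/options.
        if len(fields) < 10 or "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        if fields[4] == mount_point:
            # Later lines are mounted on top of earlier ones, so the last wins.
            result = fields[6:sep]
    return result
```

For example, propagation_of(open("/proc/self/mountinfo").read(), "/sys") shows whether /sys is currently shared in the calling mount namespace.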
> This series is an attempt to fix this situation, by persisting a mount namespace
> alongside the persistent network namespace (in a separate directory,
> /run/netns-mnt). Doing this allows us to still have a consistent /sys inside
> the namespace, but with persistence so any mounts survive.
I really don't like that direction.
"ip netns" was designed and really should continue to be a command that
makes the world look like it has a single network namespace, for
compatibility with old code. Part of that old code "ip netns" supports
is "ip" itself.
I think you are making bpffs unnecessarily per network namespace.
> This mode does come with some caveats. I'm sending this as RFC to get feedback
> on whether this is the right thing to do, especially considering backwards
> compatibility. On balance, I think that the approach taken here of
> unconditionally persisting the mount namespace, and using that persistent
> reference whenever it exists, is better than the current behaviour, and that
> while it does represent a change in behaviour it is backwards compatible in a
> way that won't cause issues. But please do comment on this; see the patch
> description of patch 4 for details.
As I understand it this will cause a problem for any application that
is network namespace aware and does not use "ip netns" to wrap itself.
I am fairly certain that pinning the mount namespace will result in
never seeing an update of /etc/resolv.conf, at least if you
are on a system that has /etc/netns/NAME/resolv.conf.
Unless I am missing something I think you are trying to solve the wrong
problem. I think all it will take is for the new mount of /sys to have
the same mounts on it as the previous mount of /sys.
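To make that concrete, here is a minimal sketch of the first step: working out which mounts currently sit on top of /sys, so that they could be carried over to a freshly mounted sysfs. It only parses text in the /proc/self/mountinfo format; the helper name submounts_of is invented for this example, and nothing here is taken from iproute2. Actually re-creating the mounts would need privileged mount calls and is left out.

```python
import re


def _unescape(path):
    # mountinfo encodes spaces and similar characters as octal, e.g. \040.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), path)


def submounts_of(mountinfo_text, top="/sys"):
    """List (mount_point, fstype) for every mount strictly below `top`.

    Hypothetical helper. Results are ordered shallowest first, so a parent
    is always listed before anything mounted beneath it.
    """
    prefix = top.rstrip("/") + "/"
    found = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fixed fields 0..5, optional tags, a lone "-", then fstype/source/options.
        if len(fields) < 10 or "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        mount_point = _unescape(fields[4])
        if mount_point.startswith(prefix):
            found.append((mount_point, fields[sep + 1]))
    # Stable sort by path depth keeps the original order within one depth.
    found.sort(key=lambda item: item[0].count("/"))
    return found
```

Calling submounts_of(open("/proc/self/mountinfo").read()) in the original mount namespace, before /sys is replaced, gives the list of mounts (bpffs among them, if mounted) that the new /sys would need to have on it.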
Eric