Message-ID: <877fcboczo.fsf@x220.int.ebiederm.org>
Date: Sat, 23 Jul 2016 23:51:07 -0500
From: ebiederm@...ssion.com (Eric W. Biederman)
To: "W. Trevor King" <wking@...mily.us>
Cc: James Bottomley <James.Bottomley@...senPartnership.com>,
Andrey Vagin <avagin@...nvz.org>,
Serge Hallyn <serge.hallyn@...onical.com>,
linux-api@...r.kernel.org, containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org,
Alexander Viro <viro@...iv.linux.org.uk>, criu@...nvz.org,
linux-fsdevel@...r.kernel.org,
"Michael Kerrisk \(man-pages\)" <mtk.manpages@...il.com>
Subject: Re: [PATCH 0/5 RFC] Add an interface to discover relationships between namespaces
"W. Trevor King" <wking@...mily.us> writes:
> On Sat, Jul 23, 2016 at 04:56:44PM -0500, Eric W. Biederman wrote:
>> "W. Trevor King" <wking@...mily.us> writes:
>> > On Sat, Jul 23, 2016 at 02:38:56PM -0700, James Bottomley wrote:
>> >> On Sat, 2016-07-23 at 14:14 -0700, W. Trevor King wrote:
>> >> > namespaces(7) and clone(2) both have:
>> >> >
>> >> > When a network namespace is freed (i.e., when the last
>> >> > process in the namespace terminates), its physical network
>> >> > devices are moved back to the initial network namespace (not
>> >> > to the parent of the process).
>> >> >
>> >> > So the initial network namespace (the head of
>> >> > net_namespace_list?) is special [1]. To understand how
>> >> > physical network devices will be handled, it seems like we want
>> >> > to treat network devices as a depth-1 tree, with all
>> >> > non-initial net namespaces as children of the initial net
>> >> > namespace. Can we extend this series' NS_GET_PARENT to return:
>> >> >
>> >> > * EPERM for an unprivileged caller (like this series currently
>> >> > does for PID namespaces),
>> >> > * ENOENT when called on net_namespace_list, and
>> >> > * net_namespace_list when called on any other net namespace.
>> >>
>> >> What's the practical application of this? independent net
>> >> namespaces are managed by the ip netns command. It pins them by
>> >> a bind mount in a flat fashion; if we make them hierarchical the
>> >> tool would probably need updating to reflect this, so we're going
>> >> to need a reason to give the network people. Just having the
>> >> interfaces not go back to root when you do an ip netns delete
>> >> doesn't seem very compelling.
>> >
>> > I'm not suggesting we add support for deeper nesting, I'm suggesting
>> > we use NS_GET_PARENT to allow sufficiently privileged users to
>> > determine if a given net namespace is the initial net namespace. You
>> > could do this already with something like:
>> >
>> > 1. Create a new net namespace.
>> > 2. Add a physical network device to that namespace.
>> > 3. Delete that namespace.
>> > 4. See if the physical network device shows up in your
>> > initial-net-namespace candidate.
>> > 5. Delete the physical network device (hopefully it ended up
>> > somewhere you can find it ;).
>> >
>> > But using an NS_GET_PARENT call seems much safer and easier.
>>
>> Have you had the problem in practice where you can't tell which
>> network namespace is the initial network namespace? This all seems
>> like a theoretical problem rather than a real one.
>
> I haven't had any practical problems here, I'm just trying to wrap my
> head around namespace-relationship discovery. The special physical
> network device handling seems a lot like init re-parenting (with no
> PR_SET_CHILD_SUBREAPER analog in a 1-deep namespace tree), so calling
> the initial network namespace a parent (and all the other namespaces
> its direct children) seems natural enough. If that doesn't sound
> convincing, I'm happy to punt this idea until someone runs into a
> practical problem ;).
Then let's punt this until someone runs into a practical problem.
For scaling and for sanity it is desirable to keep the connections
between namespaces to a minimum. Further, the initial instances of a
namespace always tend to be a little bit special.
Eric
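For readers following the thread, the three NS_GET_PARENT cases proposed
above for net namespaces can be modelled in a few lines of Python. This is
only an illustrative sketch of the proposal as quoted, which this message
punts on; it is not kernel code and does not describe the behaviour of any
actual ioctl. The names INITIAL_NET_NS, proposed_ns_get_parent and
is_initial are hypothetical, as are the string labels used to stand in for
namespaces.

```python
import errno

# Hypothetical label standing in for the initial network namespace.
INITIAL_NET_NS = "net:[initial]"


def proposed_ns_get_parent(ns, privileged):
    """Model the three cases proposed in the quoted thread.

    Returns (parent, err): parent is a namespace label or None,
    err is 0 on success or an errno value.
    """
    if not privileged:
        return None, errno.EPERM    # unprivileged caller
    if ns == INITIAL_NET_NS:
        return None, errno.ENOENT   # the initial namespace has no parent
    return INITIAL_NET_NS, 0        # all others: children in a depth-1 tree


def is_initial(ns, privileged=True):
    """The use case from the thread: detect the initial net namespace."""
    _, err = proposed_ns_get_parent(ns, privileged)
    if err == errno.EPERM:
        raise PermissionError(errno.EPERM, "caller is not privileged")
    return err == errno.ENOENT
```

Under this model a sufficiently privileged caller learns whether a namespace
is the initial one from the ENOENT case alone, without moving a physical
device around as in the five-step procedure quoted above.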