Message-ID: <87ppei45ig.fsf@x220.int.ebiederm.org>
Date: Fri, 26 Sep 2014 11:10:31 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Nicolas Dichtel <nicolas.dichtel@...nd.com>
Cc: netdev@...r.kernel.org, containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, linux-api@...r.kernel.org,
davem@...emloft.net, stephen@...workplumber.org,
akpm@...ux-foundation.org, luto@...capital.net,
Cong Wang <cwang@...pensource.com>
Subject: Re: [RFC PATCH net-next v2 0/5] netns: allow to identify peer netns
Nicolas Dichtel <nicolas.dichtel@...nd.com> writes:
> The goal of this serie is to be able to multicast netlink messages with an
> attribute that identify a peer netns.
> This is needed by the userland to interpret some informations contained in
> netlink messages (like IFLA_LINK value, but also some other attributes in case
> of x-netns netdevice (see also
> http://thread.gmane.org/gmane.linux.network/315933/focus=316064 and
> http://thread.gmane.org/gmane.linux.kernel.containers/28301/focus=4239)).
I want to say that the problem addressed by patch 3/5 of this series is a
fundamentally valid problem. We have network objects spanning network
namespaces and it would be very nice to be able to talk about them in
netlink, and file descriptors are too local and arguably too heavyweight
for netlink queries and especially for netlink broadcast messages.
Furthermore the internal concept of peernet2id seems valid.
However what you do not address is a way for CRIU (aka process
migration) to be able to restore these ids after process migration.
Going farther it looks like you are actively breaking process migration
at this time, making this set of patches a no-go.
When adding a new form of namespace id, CRIU patches are just about
as necessary as iproute patches.
> Ids are stored in the parent user namespace. These ids are valid only inside
> this user namespace. The user can retrieve these ids via a new netlink messages,
> but only if peer netns are in the same user namespace.
That does not describe what you have actually implemented in the
patches.
I see two ways to go with this.
- A per network namespace table in which you can store ids for ``peer''
network namespaces. The table would need to be populated manually by
the likes of ip netns add.
That flips the order of assignment and makes this idea solid.
Unfortunately in the case of a fully referencing mesh of N network
namespaces such a mesh winds up taking O(N^2) space, which seems
undesirable.
- Add a netlink attribute that says this network element is in a peer
network namespace.
Add a unicast query message that lets you ask if the remote
end of a tunnel is in a network namespace specified by file
descriptor.
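To make the space cost of the first option concrete, here is a minimal userspace sketch of a per-namespace peer id table. It is a model only, not kernel code: struct netns_model, peer_table_add(), peer_table_lookup() and full_mesh_entries() are hypothetical names, and a namespace is represented by a plain integer index. It shows that when every one of N namespaces stores an id for every other, the tables together hold N*(N-1) entries.

```c
#include <stdlib.h>

/* Hypothetical userspace model, not kernel code: each namespace owns a
 * table that maps a peer namespace to an id chosen by the table owner. */
struct peer_entry {
    int peer;   /* index of the peer namespace (stand-in for a real netns) */
    int id;     /* id under which the owner refers to that peer */
};

struct netns_model {
    struct peer_entry *peers;
    size_t count;
};

/* Record that `ns` knows `peer` under `id`. Returns 0 on success, -1 on
 * allocation failure. */
static int peer_table_add(struct netns_model *ns, int peer, int id)
{
    struct peer_entry *tmp;

    tmp = realloc(ns->peers, (ns->count + 1) * sizeof(*tmp));
    if (!tmp)
        return -1;
    ns->peers = tmp;
    ns->peers[ns->count].peer = peer;
    ns->peers[ns->count].id = id;
    ns->count++;
    return 0;
}

/* Return the id under which `ns` knows `peer`, or -1 if it has none. */
static int peer_table_lookup(const struct netns_model *ns, int peer)
{
    for (size_t i = 0; i < ns->count; i++)
        if (ns->peers[i].peer == peer)
            return ns->peers[i].id;
    return -1;
}

/* Build a fully referencing mesh of n namespaces and return the total
 * number of table entries that were stored (0 if nothing could be
 * allocated). Ids are handed out per table, in insertion order. */
static size_t full_mesh_entries(size_t n)
{
    struct netns_model *all = calloc(n, sizeof(*all));
    size_t total = 0;

    if (!all)
        return 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (i != j &&
                peer_table_add(&all[i], (int)j, (int)all[i].count) == 0)
                total++;
    for (size_t i = 0; i < n; i++)
        free(all[i].peers);
    free(all);
    return total;
}
```

In this model, calling full_mesh_entries() with 4 namespaces stores 12 entries and with 8 namespaces stores 56, which is the quadratic growth referred to above.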
I personally lean towards the second version as it is fundamentally
simpler, and generally scales better, and the visibility controls are
the existing visibility controls. The only downside is it requires
a query after receiving a netlink broadcast message for the times that
we care.
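The listener side of the second option might look roughly like the sketch below. Everything in it is an assumption made for illustration: struct bcast_event, struct kernel_model, query_remote_in_netns() and resolve_remote_netns() are hypothetical names, no such attribute or query message exists in the patches under discussion, and a namespace file descriptor is modelled as a plain integer identity. The point is only to show the flow: a broadcast carries a flag, and a follow-up query is sent only when the flag is set and the listener cares.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the second option. None of these names exist in
 * the kernel or in iproute2. */

/* What a listener would see in a netlink broadcast. */
struct bcast_event {
    int ifindex;
    bool remote_in_peer_netns;  /* the proposed "in a peer netns" attribute */
};

/* Stand-in for the kernel side of the query. */
struct kernel_model {
    int remote_netns;       /* identity of the netns holding the remote end */
    unsigned int queries;   /* number of unicast queries served so far */
};

/* Models the proposed unicast query: is the remote end of `ifindex` in
 * the namespace that `netns_fd` refers to? In this model the fd value is
 * the namespace identity itself. */
static bool query_remote_in_netns(struct kernel_model *k, int ifindex,
                                  int netns_fd)
{
    (void)ifindex;          /* the model tracks a single device */
    k->queries++;
    return k->remote_netns == netns_fd;
}

/* Listener logic: returns the index in `known_fds` of the namespace that
 * holds the remote end, or -1 when no query was needed or none matched. */
static int resolve_remote_netns(struct kernel_model *k,
                                const struct bcast_event *ev,
                                bool interested,
                                const int *known_fds, size_t n)
{
    if (!ev->remote_in_peer_netns || !interested)
        return -1;          /* no follow-up query is sent */
    for (size_t i = 0; i < n; i++)
        if (query_remote_in_netns(k, ev->ifindex, known_fds[i]))
            return (int)i;
    return -1;              /* remote end is in a netns we hold no fd for */
}
```

Because the listener can only ask about namespaces it already holds a file descriptor for, the model needs no new visibility rule, and the extra cost is limited to the queries issued for events the listener is interested in.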
Eric