Message-ID: <20180418215246.GA24000@gmail.com>
Date: Wed, 18 Apr 2018 23:52:47 +0200
From: Christian Brauner <christian.brauner@...onical.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, avagin@...tuozzo.com,
ktkhai@...tuozzo.com, serge@...lyn.com, gregkh@...uxfoundation.org
Subject: Re: [PATCH net-next 2/2] netns: isolate seqnums to use per-netns
locks
On Wed, Apr 18, 2018 at 11:55:52AM -0500, Eric W. Biederman wrote:
> Christian Brauner <christian.brauner@...ntu.com> writes:
>
> > Now that it's possible to have a different set of uevents in different
> > network namespaces, per-network namespace uevent sequence numbers are
> > introduced. This increases performance as locking is now restricted to the
> > network namespace affected by the uevent rather than locking
> > everything.
>
> Numbers please. I personally expect that the netlink mc_list issues
> will swamp any benefit you get from this.
I don't see how this would be the case. The gist is this:
Every time you send a uevent into a network namespace *not* owned by
init_user_ns you currently *have* to take mutex_lock(uevent_sock_list),
effectively blocking the host from processing uevents even though
- the uevent you're receiving might be totally different from the
uevent that you're sending
- the uevent socket of the non-init_user_ns owned network namespace
isn't even recorded in the list.
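To illustrate the contention argument, here is a rough userspace sketch.
It is not kernel code: struct model_netns, model_global_lock and the
model_* helpers are made-up names, and pthread mutexes stand in for the
kernel mutex. It only models the claim that with one shared lock a
sender in one namespace holds the lock the host needs, while with one
lock per namespace it does not.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model; these are not kernel symbols. */
struct model_netns {
	pthread_mutex_t uevent_lock;	/* assumed per-netns lock */
};

/* One lock shared by every sender, standing in for the global mutex. */
static pthread_mutex_t model_global_lock = PTHREAD_MUTEX_INITIALIZER;

/* True if a sender could take 'lock' right now without waiting. */
static bool model_can_send(pthread_mutex_t *lock)
{
	assert(lock != NULL);
	if (pthread_mutex_trylock(lock) != 0)
		return false;	/* lock is busy: another sender holds it */
	pthread_mutex_unlock(lock);
	return true;
}

/* Global scheme: a container's send holds the lock the host needs. */
static bool model_host_blocked_global(void)
{
	bool blocked;

	pthread_mutex_lock(&model_global_lock);		/* container sends */
	blocked = !model_can_send(&model_global_lock);	/* host tries */
	pthread_mutex_unlock(&model_global_lock);
	return blocked;
}

/* Per-netns scheme: the container's lock is not the host's lock. */
static bool model_host_blocked_per_ns(struct model_netns *host,
				      struct model_netns *container)
{
	bool blocked;

	assert(host != NULL && container != NULL);
	pthread_mutex_lock(&container->uevent_lock);	/* container sends */
	blocked = !model_can_send(&host->uevent_lock);	/* host tries */
	pthread_mutex_unlock(&container->uevent_lock);
	return blocked;
}
```

In this model model_host_blocked_global() reports the host as blocked
while model_host_blocked_per_ns() does not. It says nothing about how
large the effect is on a real system.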
The other argument is that we now have properly isolated network
namespaces wrt uevents such that each netns can have its own set of
uevents. This can happen either because a sufficiently privileged
userspace process sends it uevents that are dedicated to that specific
netns, or, and this *has been true for a long time*, because network
devices are *properly namespaced*, meaning a uevent for that network
device is *tied to a network namespace*. In both cases a single global
uevent sequence number will be absolutely misleading. For example, when
you create a new veth device in a new network namespace it shouldn't be
accounted against the initial network namespace but *only* against the
network namespace that has that device added to it.
Thanks!
Christian