linux-kernel - Re: Possible netns creation and execution performance/scalability regression since v3.8 due to rcu callbacks being offloaded to multiple cpus


Message-ID: <20140611161857.GC4581@linux.vnet.ibm.com>
Date:	Wed, 11 Jun 2014 09:18:57 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	chiluk@...onical.com
Cc:	Rafael Tinoco <rafael.tinoco@...onical.com>,
	linux-kernel@...r.kernel.org, davem@...emloft.net,
	ebiederm@...ssion.com,
	Christopher Arges <chris.j.arges@...onical.com>,
	Jay Vosburgh <jay.vosburgh@...onical.com>
Subject: Re: Possible netns creation and execution performance/scalability
 regression since v3.8 due to rcu callbacks being offloaded to multiple cpus

On Wed, Jun 11, 2014 at 10:46:00AM -0500, David Chiluk wrote:
> On 06/11/2014 10:17 AM, Rafael Tinoco wrote:
> > This script simulates, for example, a failure in a cloud infrastructure. As
> > soon as one virtualization host fails, all its network namespaces have to be
> > migrated to another node. Creating thousands of netns in the shortest time possible
> > is the objective here. This regression was observed trying to migrate from
> > v3.5 to v3.8+.
> > 
> > The script creates up to 3000 to 4000 network namespaces and places
> > links in them. At every 250 mark (netns already created) it reports a throughput
> > average (how many were created per second from the last mark to this one).
> 
> Here's a little more background, and the "why it matters".

Thank you, this is quite helpful.

> In an OpenStack cloud, neutron (OpenStack's networking framework) keeps
> all customers of the cloud separated via network namespaces.  On each
> compute node this is not a big deal, since each compute node can only
> handle at most a few hundred VMs.  However in order for neutron to route
> a customer's network traffic between disparate compute hosts, it uses
> the concept of a neutron gateway.  In order for customer A's VM on host
> 1 to talk to customer A's VM on host 2, it must first go through a GRE
> tunnel to the neutron gateway.  The neutron gateway then turns around and
> routes the network traffic over another GRE tunnel to host 2.  The
> neutron gateway is where the problem is.
> 
> The neutron gateway must have a network namespace for every net
> namespace in the cloud.  Granted this collection can be split up by
> increasing the number of neutron gateways (scaling out), but some
> clouds have decided to run these gateways on very beefy machines.  As
> you can see by the graph, there is a software limitation that prevents
> these machines from hosting any more than a few thousand namespaces.
> This makes the gateway's hardware severely under-utilized.
> 
> Now think about what happens when a gateway goes down: the namespaces
> need to be migrated, or a new machine needs to be brought up to replace
> it.  When we're talking about 3000 namespaces, the amount of time it
> takes simply to recreate the namespaces becomes very significant.
> 
> The script is a stripped down example of what exactly is being done on
> the neutron gateway in order to create namespaces.
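
For readers following along, the measurement described above (create
namespaces one at a time and report the average creation rate at every
250 mark) can be sketched roughly as follows.  This is not the script
from the thread.  It is a minimal illustration that assumes namespaces
are created with iproute2's "ip netns add" (which needs root), and it
leaves out the step of placing links in each namespace.  The function
names, the "test-ns-" prefix, and the injectable create/clock parameters
are hypothetical and exist only to keep the sketch self-contained.

```python
import subprocess
import time


def create_netns(name):
    # Assumption: namespaces are created via iproute2; requires root.
    subprocess.run(["ip", "netns", "add", name], check=True)


def measure_creation_rate(total, mark=250, create=create_netns,
                          clock=time.monotonic):
    """Create `total` namespaces one at a time.

    Returns a list of (namespaces_created_so_far, namespaces_per_second)
    tuples, one per `mark` namespaces, where the rate covers only the
    interval since the previous mark.
    """
    rates = []
    last = clock()
    for i in range(1, total + 1):
        create(f"test-ns-{i}")  # hypothetical naming scheme
        if i % mark == 0:
            now = clock()
            elapsed = now - last
            rate = mark / elapsed if elapsed > 0 else float("inf")
            rates.append((i, rate))
            last = now
    return rates
```

Calling measure_creation_rate(3000) as root would attempt to create 3000
namespaces and return one rate per 250 mark; a rate that drops as the
count grows would correspond to the scalability problem being discussed.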

Are the namespaces torn down and recreated one at a time, or is there some
syscall, ioctl(), or whatever that allows bulk teardown and re-creation?

							Thanx, Paul
