lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <m1bpijjywm.fsf@fess.ebiederm.org>
Date:	Mon, 30 Nov 2009 16:55:37 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, hadi@...erus.ca, dlezcano@...ibm.com,
	adobriyan@...il.com, kaber@...sh.net
Subject: Re: [PATCH 0/20] Batch network namespace cleanup

David Miller <davem@...emloft.net> writes:

> From: ebiederm@...ssion.com (Eric W. Biederman)
> Date: Sun, 29 Nov 2009 17:46:03 -0800
>
>> Recently Jamal and Daniel perform some experiments and found that
>> large numbers of network namespace exiting simultaneously is very
>> inefficient.  24+ minutes in some configurations.  The cpu overhead
>> was negligible but it results in long hold times of net_mutex, and
>> memory being consumed a long time after the last user has gone away.
>> 
>> I looked into it and discovered that by batching network namespace
>> cleanups I can reduce the time for 4k network namespaces exiting from
>> 5-7 minutes in my configuration to 44 seconds.
>> 
>> This patch series is my set of changes to the network namespace core
>> and associated cleanups to allow for network namespace batching.
>
> All applied, and assuming all of the build checks pass I'll
> push this out to net-next-2.6
>
> I should look into that inet_twsk_purge performance issue you mention
> when tearing down a namespace.  It walks the entire hash table and
> takes a lock for every hash chain.
>
> Eric, is it possible for us to at least slightly optimize this by
> doing a peek at whether the head pointer of each chain is NULL and
> bypass the spinlock and everything else in that case?  Or is this
> not legal with sk_nulls?
>
> Something like:
>
> 	if (hlist_nulls_empty(&head->twchain))
> 		continue;
>
> right before the 'restart' label?

I haven't had a chance to wrap my head around that case fully yet.

After playing with a few ideas I think what we want to do algorithmically
is to have a batched flush like we do for the routing table cache.  That
should get the cost down to only about 100ms.  Which is much better when
you have a lot of them but is still a lot of time.

>From my preliminary investigation I believe we don't need to take any
locks to traverse the hash table.  I think we can do the entire hash
table traversal under simple rcu protection, and only take the lock on
the individual time wait entries to delete them.

....

Also for best batching we have ipip, ipgre, ip6_tunnel, sit, vlan,
and bridging that need to be taught to use rtnl_link_ops and let
the generic code delete their devices.  

The changes for the vlan code are simple.  The rest I haven't finished
wrapping my head around the drivers individual requirements for.   Still
the changes should not be sweeping.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ