Message-ID: <4ADF2B57.4030708@gmail.com>
Date:	Wed, 21 Oct 2009 17:40:07 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Octavian Purdila <opurdila@...acom.com>
CC:	Benjamin LaHaise <bcrl@...et.ca>, netdev@...r.kernel.org,
	Cosmin Ratiu <cratiu@...acom.com>
Subject: [PATCH] net: allow netdev_wait_allrefs() to run faster

Octavian Purdila wrote:
> On Sunday 18 October 2009 21:21:44 you wrote:
>>> The msleep(250) should be tuned first. Then if this is really necessary
>>> to dismantle 100.000 netdevices per second, we might have to think a bit
>>> more. 
>>> Just try msleep(1 or 2), it should work quite well.
>> My goal is tearing down 100,000 interfaces in a few seconds, which really
>>  is  necessary.  Right now we're running about 40,000 interfaces on a not
>>  yet saturated 10Gbps link.  Going to dual 10Gbps links means pushing more
>>  than 100,000 subscriber interfaces, and it looks like a modern dual socket
>>  system can handle that.
>>
> 
> I would also like to see this patch in, we are running into scalability issues 
> with creating/deleting lots of interfaces as well.

Ben's patch only addresses interface deletion, and only one part of that
problem, maybe the most visible one for the current kernel.

Adding lots of interfaces only needs several threads running concurrently.

Before applying/examining his patch I suggest identifying all dev_put() spots that
can be deleted and replaced by something more scalable. I began this job
but others can help me.

RTNL and RCU grace periods are going to hurt anyway, so you probably need
to use many tasks to be able to delete lots of interfaces in parallel.

netdev_run_todo() should also use a better algorithm to allow parallelism.

The following patch doesn't slow down dev_put() users; with it applied, the
real scalability problems will surface and can then be addressed.

[PATCH] net: allow netdev_wait_allrefs() to run faster

netdev_wait_allrefs() waits until all references to a device vanish.

It currently uses a _very_ pessimistic 250 ms delay between probes.
Some users report that no more than 4 devices can be dismantled per second,
which is a pretty serious problem for extreme setups.

Most likely, the remaining references only wait for an RCU grace period, which
should come fast, so use schedule_timeout_uninterruptible(1) to allow faster
recovery.

Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
---
 net/core/dev.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 28b0b9e..fca2e4a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4983,7 +4983,7 @@ static void netdev_wait_allrefs(struct net_device *dev)
 			rebroadcast_time = jiffies;
 		}
 
-		msleep(250);
+		schedule_timeout_uninterruptible(1);
 
 		if (time_after(jiffies, warning_time + 10 * HZ)) {
 			printk(KERN_EMERG "unregister_netdevice: "