lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 07 May 2011 20:32:54 +0200 From: Eric Dumazet <eric.dumazet@...il.com> To: Alex Bligh <alex@...x.org.uk> Cc: netdev@...r.kernel.org Subject: Re: Scalability of interface creation and deletion Le samedi 07 mai 2011 à 19:24 +0100, Alex Bligh a écrit : > Eric, > > --On 7 May 2011 18:26:29 +0200 Eric Dumazet <eric.dumazet@...il.com> wrote: > > > Here, on 2.6.38 kernel (Ubuntu 11.04 provided, on my 2 core laptop) > ># time rmmod dummy > > real 0m0.111s > ... > > On another machine with a very recent kernel : > > $ modprobe dummy numdummies=1 > > $ ifconfig dummy0 192.168.46.46 up > > $ time rmmod dummy > > > > real 0m0.032s > > I know it's different machines, but that's a pretty significant > difference. So I compiled from 2.6.39-rc6 head (i.e. a kernel > less than an hour old), with only your suggested change in, > so that (a) I could eliminate old kernels, and (b) I could > instrument it. > > > synchronize_rcu() calls are not consuming cpu, they just _wait_ > > rcu grace period. > > > > I suggest you read Documentation/RCU files if you really want to :) > > I understand the basic point: it needs to wait for all readers > to drop their references. It's sort of hard to understand why > on a machine with an idle network there would be reader(s) holding > references for 250ms. And indeed the analysis below shows that > isn't the case (it's more like 44 ms). > > > If you want to check how expensive it is, its quite easy: > > add a trace in synchronize_net() > > At least for veth devices, I see the same on 2.6.39-rc6 - if anything > it's worse: > > # ./ifseq -n 100 > Sat May 7 17:50:53 UTC 2011 creating 100 interfaces > Sat May 7 17:50:54 UTC 2011 done > > real 0m1.549s > user 0m0.060s > sys 0m0.990s > Sat May 7 17:50:54 UTC 2011 deleting 100 interfaces > Sat May 7 17:51:22 UTC 2011 done > > real 0m27.917s > user 0m0.420s > sys 0m0.060s > > Performing that operation produced exactly 200 calls to synchronize net. > The timestamps indicate that's 2 per veth pair deletion, and zero > per veth pair creation. > > Analysing the resultant logs shows only 31% of the problem is > time spent within synchronize_net() (perl script below). > > $ ./analyse.pl < syncnet | tail -2 > Total 18.98515 Usage 199 Average 0.09540 elsewhere > Total 8.77581 Usage 200 Average 0.04388 synchronizing > > So *something* is spending more than twice as much time as > synchronize_net(). > > I've attached the log below as well. > > -- > Alex Bligh > > > $ cat analyse.pl > #!/usr/bin/perl > > use strict; > use warnings; > > my $lastuptime; > my $uptime; > my $diff; > my $area; > my %time; > my %usage; > > while (<>) > { > chomp; > if (m/\[\s*([0-9.]+)\].*synchronize_net/) > { > $uptime = $1; > if (defined($lastuptime)) > { > $area = (m/end/)?"synchronizing":"elsewhere"; > $diff = $uptime - $lastuptime; > printf "%5.5f $area\n", $diff; > $time{$area}+=$diff; > $usage{$area}++; > } > $lastuptime = $uptime; > } > } > > print "\n"; > > my $k; > foreach $k (sort keys %time) > { > printf "Total %5.5f Usage %d Average %5.5f %s\n", $time{$k}, > $usage{$k}, $time{$k}/$usage{$k}, $k; > } > > > > May 7 17:50:55 nattytest kernel: [ 127.490142] begin synchronize_net() > May 7 17:50:55 nattytest kernel: [ 127.560084] end synchronize_net() > May 7 17:50:55 nattytest kernel: [ 127.610350] begin synchronize_net() > May 7 17:50:55 nattytest kernel: [ 127.610932] end synchronize_net() > May 7 17:50:55 nattytest kernel: [ 127.740078] begin synchronize_net() > May 7 17:50:55 nattytest kernel: [ 127.820071] end synchronize_net() Well, there is also one rcu_barrier() call that is expensive. (It was changed from one synchronize_rcu() to one rcu_barrier() lately in commit ef885afb , in 2.6.36 kernel) net/core/dev.c line 5167 http://git2.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ef885afbf8a37689afc1d9d545e2f3e7a8276c17 netdev_wait_allrefs() waits that all references to a device vanishes. It currently uses a _very_ pessimistic 250 ms delay between each probe. Some users reported that no more than 4 devices can be dismantled per second, this is a pretty serious problem for some setups. Most of the time, a refcount is about to be released by an RCU callback, that is still in flight because rollback_registered_many() uses a synchronize_rcu() call instead of rcu_barrier(). Problem is visible if number of online cpus is one, because synchronize_rcu() is then a no op. time to remove 50 ipip tunnels on a UP machine : before patch : real 11.910s after patch : real 1.250s Reported-by: Nicolas Dichtel <nicolas.dichtel@...nd.com> Reported-by: Octavian Purdila <opurdila@...acom.com> Reported-by: Benjamin LaHaise <bcrl@...ck.org> Signed-off-by: Eric Dumazet <eric.dumazet@...il.com> Signed-off-by: David S. Miller <davem@...emloft.net> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists