Date:	Wed, 6 Mar 2013 10:52:05 +0100
From:	Johannes Rudolph <johannes.rudolph@...glemail.com>
To:	netdev@...r.kernel.org
Subject: Spinlock spinning in __inet_hash_connect

Hello all,

I hope I'm on the right mailing list for raising this issue. We are
seeing a problem while running a load test with JMeter against a web
server [1]. The test suite uses 50 threads to connect to a localhost
web server, runs one HTTP request per connection, and then loops.
After the test has been running for about 10 seconds (~100,000
connections established/closed), CPU load goes up and the connection
rate drops massively (see [1] for a chart). With `perf top` I'm
observing this on the _client_ side:

 41.39%  [kernel]                                    [k] __ticket_spin_lock
 16.83%  [kernel]                                    [k] __inet_check_established
 12.50%  [kernel]                                    [k] __inet_hash_connect
  4.35%  [kernel]                                    [k] __ticket_spin_unlock

I've also recorded a call graph, a log of which you can find at [2].
This was on Ubuntu 12.10, Linux 3.6.3-030603-generic x86_64. The same
test run against another web server doesn't show this behavior in
this particular setup.
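
For reference, this is roughly what each client thread is doing per
connection (a minimal sketch in C, not the actual JMeter code; the
127.0.0.1:8080 target and the request line are placeholders):

#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
	const char req[] =
		"GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n";
	char buf[4096];
	struct sockaddr_in addr = { 0 };

	addr.sin_family = AF_INET;
	addr.sin_port = htons(8080);	/* placeholder port */
	inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

	for (;;) {	/* the test runs 50 of these loops in parallel */
		int fd = socket(AF_INET, SOCK_STREAM, 0);

		if (fd < 0 ||
		    connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
			perror("socket/connect");
			return 1;
		}
		write(fd, req, sizeof(req) - 1);
		while (read(fd, buf, sizeof(buf)) > 0)
			;		/* drain the response */
		close(fd);		/* active close: the socket ends up in
					 * TIME_WAIT, pinning its ephemeral
					 * port until the timeout expires */
	}
}

Every connect() without an explicit bind() has to pick a fresh
ephemeral port, which is the path through __inet_hash_connect /
__inet_check_established that shows up in the profile above.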

I've found a related issue for a totally different application and
setup [3]. The problem seems related to handing out ephemeral ports
when only few of them are available, and (I guess) heavy contention
on the lock protecting the ephemeral port hash table. As suggested in
[3], setting `tcp_tw_reuse=1` seems to fix the issue in this
particular test case, but that may only be because it takes pressure
off the available ports.
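
For completeness, the workaround boils down to
`sysctl -w net.ipv4.tcp_tw_reuse=1`, or equivalently (a sketch of the
same /proc write; needs root):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* same effect as `sysctl -w net.ipv4.tcp_tw_reuse=1` */
	int fd = open("/proc/sys/net/ipv4/tcp_tw_reuse", O_WRONLY);

	if (fd < 0 || write(fd, "1", 1) != 1) {
		perror("tcp_tw_reuse");
		return 1;
	}
	close(fd);

	/* widening /proc/sys/net/ipv4/ip_local_port_range should relieve
	 * the same pressure without reusing TIME_WAIT sockets, which might
	 * help distinguish port exhaustion from lock contention */
	return 0;
}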

Before doing more research I wanted to put this here for the record
and ask for suggestions on how to proceed. What I could do:

 * run the test on a more recent kernel (3.8.2)
 * provide instructions on how to reproduce the behavior
 * upload the `perf report` output if that helps

Thanks,

--
Johannes

[1] https://groups.google.com/d/topic/spray-user/76klWTHtsr4/discussion
[2] https://gist.github.com/jrudolph/5098113
[3] https://bugs.launchpad.net/percona-playback/+bug/1059330

-----------------------------------------------
Johannes Rudolph
http://virtual-void.net
