Message-Id: <6.2.5.6.2.20111005020421.03a9e6c0@binnacle.cx>
Date:	Wed, 05 Oct 2011 02:11:27 -0400
From:	starlight@...nacle.cx
To:	Joe Perches <joe@...ches.com>, Christoph Lameter <cl@...two.org>,
	Serge Belyshev <belyshev@...ni.sinp.msu.ru>,
	Con Kolivas <kernel@...ivas.org>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	linux-kernel@...r.kernel.org, netdev <netdev@...r.kernel.org>,
	Willy Tarreau <w@....eu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Stephen Hemminger <stephen.hemminger@...tta.com>
Subject: Re: big picture UDP/IP performance question re 2.6.18
  -> 2.6.32

Gremlins!

In my haste I overlooked that changing the
thread pool from 140 to 13 is radical,
especially with a core count matching
the thread count.  Apples and oranges.

Ran the small-pool test on the other two
kernels and here are the results.  The
user and system columns are in jiffies,
total is user + system CPU time, and the
percentage is system time as a share of
that total.

kernel          total     user    system  sys%
2.6.18(rhel5)   02:07:16  615516  148152  19.4%
2.6.39.4        02:27:44  658074  228420  25.7%
2.6.39.4(bfs)   02:34:49  899936   29000   3.1%

So BFS performs somewhat worse (about 5%)
than the default scheduler on total CPU.
The old RHEL 5 kernel is still the winner,
but not by nearly as much in the small
thread-pool scenario: 2.6.39.4 is only 16%
slower than 2.6.18, with the system
overhead roughly 55% worse rather than
100% worse.
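
For anyone who wants to check the arithmetic,
here is a minimal Python sketch that rederives
the totals, the system share and the relative
slowdowns from the jiffy counts in the table.
USER_HZ = 100 is an assumption (it is what makes
the jiffy sums match the h:mm:ss totals), and
the names summarize() and ratio() are made up
for this illustration.

```python
# Jiffy counts (user, system) copied from the table in this message.
RUNS = {
    "2.6.18(rhel5)": (615516, 148152),
    "2.6.39.4":      (658074, 228420),
    "2.6.39.4(bfs)": (899936, 29000),
}

# Assumption: 100 jiffies per second as seen from user space.
USER_HZ = 100


def summarize(user, system, hz=USER_HZ):
    """Return total CPU time and the system share for one run."""
    jiffies = user + system
    seconds = jiffies // hz
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return {
        "jiffies": jiffies,
        "system": system,
        "total": f"{hours:02d}:{minutes:02d}:{secs:02d}",
        "sys_share": system / jiffies,
    }


def ratio(summary, new, old, field):
    """Ratio of one field between two runs (new / old)."""
    return summary[new][field] / summary[old][field]


summary = {name: summarize(u, s) for name, (u, s) in RUNS.items()}

for name, row in summary.items():
    print(f"{name:15s} {row['total']}  sys {row['sys_share']:.1%}")

# Total CPU and system time, 2.6.39.4 relative to 2.6.18.
print("total CPU:", ratio(summary, "2.6.39.4", "2.6.18(rhel5)", "jiffies"))
print("system:   ", ratio(summary, "2.6.39.4", "2.6.18(rhel5)", "system"))
# Total CPU, BFS relative to the default scheduler on the same kernel.
print("bfs:      ", ratio(summary, "2.6.39.4(bfs)", "2.6.39.4", "jiffies"))
```

The ratios compare total CPU consumed, not
wall-clock time, which is the sense in which
the percentages above are meant.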

So all this really shows is that the
differential between the older and
newer kernels is strongly influenced by
the number of active threads, and that the
O(N) aspect of BFS makes it an inappropriate
choice for heavily multithreaded workloads.

Also, small thread pools are more efficient
than large ones (especially when the workflow
is routed for optimal cache locality, as
is the case here).  However, the small
pool does not scale as well to maximum CPU
load as the large pool (a different test),
which is why it is no longer used in
production and was only dusted off
to enable the BFS comparison.

