linux-kernel - Re: big picture UDP/IP performance question re 2.6.18 -> 2.6.32

Message-Id: <6.2.5.6.2.20111005020421.03a9e6c0@binnacle.cx>
Date:	Wed, 05 Oct 2011 02:11:27 -0400
From:	starlight@...nacle.cx
To:	Joe Perches <joe@...ches.com>, Christoph Lameter <cl@...two.org>,
	Serge Belyshev <belyshev@...ni.sinp.msu.ru>,
	Con Kolivas <kernel@...ivas.org>
Cc:	Eric Dumazet <eric.dumazet@...il.com>,
	linux-kernel@...r.kernel.org, netdev <netdev@...r.kernel.org>,
	Willy Tarreau <w@....eu>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Stephen Hemminger <stephen.hemminger@...tta.com>
Subject: Re: big picture UDP/IP performance question re 2.6.18
  -> 2.6.32

Gremlins!

In my haste I overlooked that cutting the
thread pool from 140 to 13 threads is a radical
change, especially when the core count matches
the thread count.  Apples and oranges.

Ran the small-pool test on the other two
kernels and here are the results.  The
user and system columns are in jiffies.

kernel          total      user   system (% of total)
2.6.18(rhel5)   02:07:16  615516  148152 (19.4%)
2.6.39.4        02:27:44  658074  228420 (25.8%)
2.6.39.4(bfs)   02:34:49  899936   29000 (3.1%)
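
For anyone who wants to check the table, here is a minimal
sketch that rebuilds the total and system-share columns from
the user and system jiffies.  The 100 ticks per second is my
assumption (it is what makes user+system match the total
column), and the helper names are made up for illustration.

```python
# Assumption: 100 jiffies per second. This is not stated in the post,
# but it makes user + system line up with the 'total' column.
USER_HZ = 100

# (user, system) jiffies copied from the table.
runs = {
    "2.6.18(rhel5)": (615516, 148152),
    "2.6.39.4": (658074, 228420),
    "2.6.39.4(bfs)": (899936, 29000),
}


def hms(jiffies, hz=USER_HZ):
    """Format a jiffy count as hh:mm:ss, dropping fractional seconds."""
    seconds = jiffies // hz
    return "%02d:%02d:%02d" % (seconds // 3600, seconds % 3600 // 60, seconds % 60)


def system_share(user, system):
    """System time as a percentage of total (user + system) CPU time."""
    return 100.0 * system / (user + system)


for name, (user, system) in runs.items():
    print("%-15s %s  system %.1f%%" % (name, hms(user + system), system_share(user, system)))
```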

So BFS uses about 5% more total CPU than the
default scheduler.  The old RHEL 5 kernel
is still the winner, but not by nearly as
much in the small thread-pool scenario: 2.6.39.4
uses only 16% more total CPU than 2.6.18, with
system time about 54% higher rather
than 100% higher.
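
The relative figures can be recomputed from the jiffy counts
in the table.  A rough sketch, with a helper name of my own
choosing:

```python
def pct_increase(new, old):
    """Percentage by which 'new' exceeds 'old'."""
    return 100.0 * (new - old) / old


# (user, system) jiffies from the table.
rhel5 = (615516, 148152)      # 2.6.18(rhel5)
mainline = (658074, 228420)   # 2.6.39.4, default scheduler
bfs = (899936, 29000)         # 2.6.39.4 with BFS

# Total CPU and system time of 2.6.39.4 relative to 2.6.18.
total_vs_rhel5 = pct_increase(sum(mainline), sum(rhel5))
system_vs_rhel5 = pct_increase(mainline[1], rhel5[1])

# Total CPU of BFS relative to the default scheduler on 2.6.39.4.
bfs_vs_mainline = pct_increase(sum(bfs), sum(mainline))

print("2.6.39.4 vs 2.6.18 total CPU:  +%.1f%%" % total_vs_rhel5)
print("2.6.39.4 vs 2.6.18 system:     +%.1f%%" % system_vs_rhel5)
print("BFS vs default total CPU:      +%.1f%%" % bfs_vs_mainline)
```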

So all this really shows is that
the differential between the older and
newer kernels is strongly influenced by
the number of active threads, and that the O(N)
aspect of BFS makes it an inappropriate
choice for heavily multithreaded workloads.

Also, small thread pools are more efficient
than large ones (especially when the workflow
is routed for optimal cache locality, as
is the case here).  However, the small
pool does not scale as well to maximum CPU
load as the large pool (a different test),
which is why it is no longer used in
production and was only dusted off
to enable the BFS comparison.
