Date:	Fri, 10 Oct 2008 03:17:59 +0400
From:	Evgeniy Polyakov <s0mbre@...rvice.net.ru>
To:	netdev@...r.kernel.org
Cc:	linux-kernel@...r.kernel.org,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Ingo Molnar <mingo@...e.hu>, David Miller <davem@...emloft.net>
Subject: [tbench regression fixes]: digging out smelly deadmen.


Hi.

It was reported recently that tbench has a long history of regressions,
starting at least from the 2.6.23 kernel. I verified that in my test
environment tbench 'lost' more than 100 MB/s, from 470 down to 355,
between at least 2.6.24 and 2.6.27. The 2.6.26-2.6.27 performance regression
on my machines roughly corresponds to a drop from 375 to 355 MB/s.

I spent several days on various tests and bisections (unfortunately
bisect cannot always point to the 'right' commit), and found the following
problems.

First, related to the network, as lots of people expected: TSO/GSO over
loopback with the tbench workload eats about 5-10 MB/s, since the TSO/GSO frame
creation overhead is not paid back by the optimized super-frame processing
gains. Since it brings a really impressive improvement in big-packet
workloads, it was (likely) decided not to add a patch for this, but
instead one can disable TSO/GSO via ethtool. TSO/GSO over loopback was added
in the 2.6.27 window, so it has its part in that regression.
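
For anyone who wants to repeat the comparison from a test script, here is a
minimal sketch of switching the offloads off before a run. It assumes the
loopback device is named "lo" and that ethtool is installed; the helper names
offload_command and set_offloads are made up for illustration, only the
"ethtool -K <dev> <feature> on|off" form itself is the real tool syntax.

```python
import shutil
import subprocess


def offload_command(device="lo", features=("tso", "gso"), enable=False):
    """Build the ethtool argument list that switches offload features.

    The device name "lo" is an assumption; check `ip link` on the test box.
    """
    state = "on" if enable else "off"
    cmd = ["ethtool", "-K", device]
    for feature in features:
        cmd += [feature, state]
    return cmd


def set_offloads(device="lo", features=("tso", "gso"), enable=False):
    """Run ethtool (needs root). Return True on success, False otherwise."""
    if shutil.which("ethtool") is None:
        return False  # ethtool is not installed on this machine
    result = subprocess.run(
        offload_command(device, features, enable),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling set_offloads() before a tbench run and set_offloads(enable=True)
afterwards gives the two data points; " ".join(offload_command()) shows the
exact command line if you would rather type it by hand.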

The second part of the 2.6.26-2.6.27 window regression (as a reminder, it is
about 20 MB/s) is related to the scheduler changes, which was expected by
another group of people. I tracked it down to commit
a7be37ac8e1565e00880531f4e2aff421a21c803, which, when
reverted, returns 2.6.27 tbench performance to the highest (for
2.6.26-2.6.27) 365 MB/s mark. I also tested the tree stopped at the above
commit itself, i.e. not 2.6.27, and got 373 MB/s, so likely other
changes in that merge ate a couple of megs. The attached patch is against 2.6.27.

A curious reader may ask: where did we lose the other 100 MB/s? This small
issue was not detected (or at least not reported to netdev@ with a provocative
enough subject), and it happened to live somewhere in the 2.6.24-2.6.25 changes.
I was lucky enough to 'guess' (after just a couple of hundred compilations)
that it corresponds to commit 8f4d37ec073c17e2d4aa8851df5837d798606d6f, about
high-resolution timers; the attached patch against 2.6.25 brings tbench
performance for the 2.6.25 kernel tree to 455 MB/s.

That still leaves about 20 MB/s unaccounted for, since 2.6.24 gets 475 MB/s, so
the bug likely lives between 2.6.24 and the above 8f4d37ec073 commit.
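
To keep the numbers straight, here is a small sketch of the bookkeeping. The
MB/s figures are the ones quoted in this message (using 475 for 2.6.24); the
table layout and the helper names drop and report are only for illustration.

```python
# tbench throughput in MB/s, copied from the figures quoted in this message.
TBENCH_MBPS = {
    "2.6.24": 475,
    "2.6.25 with attached patch": 455,
    "2.6.26": 375,
    "2.6.27": 355,
    "2.6.27 with a7be37ac reverted": 365,
}


def drop(before, after):
    """Return (absolute loss in MB/s, loss as a percentage of `before`)."""
    lost = before - after
    return lost, 100.0 * lost / before


def report(label, before_key, after_key):
    """Format the loss between two entries of TBENCH_MBPS as one line."""
    lost, pct = drop(TBENCH_MBPS[before_key], TBENCH_MBPS[after_key])
    return f"{label}: {lost} MB/s ({pct:.1f}%)"


if __name__ == "__main__":
    print(report("2.6.24 -> 2.6.27", "2.6.24", "2.6.27"))
    print(report("2.6.26 -> 2.6.27", "2.6.26", "2.6.27"))
    print(report("still missing in 2.6.25", "2.6.24", "2.6.25 with attached patch"))
```

The same drop() helper can be pointed at the reverted 2.6.27 result to see how
much of the 2.6.26-2.6.27 gap the scheduler revert alone gives back.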

I can test your patches (the most interesting attached one does not
apply cleanly to the current tree) for the 2.6.27 tree tomorrow
(it is past 3 A.M. in Moscow).

P.S. I'm not currently subscribed to any of the mentioned lists (and am writing
from a long-unused email address), so I cannot find the appropriate subject and
reply in the thread.

-- 
	Evgeniy Polyakov

View attachment "return-10mb-2.6.27.diff" of type "text/x-diff" (5994 bytes)

View attachment "return-80mb-2.6.25.diff" of type "text/x-diff" (17899 bytes)
