Message-ID: <CAKA=qzb6kLTeJW4U42J12Fq50_54TFBHgEOcR3DyZgg_LoLtdQ@mail.gmail.com>
Date:	Tue, 19 Apr 2016 10:13:28 -0500
From:	Josh Hunt <joshhunt00@...il.com>
To:	"Butler, Peter" <pbutler@...usnet.com>
Cc:	Rick Jones <rick.jones2@....com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Poorer networking performance in later kernels?

On Tue, Apr 19, 2016 at 9:54 AM, Butler, Peter <pbutler@...usnet.com> wrote:
>> -----Original Message-----
>> From: Rick Jones [mailto:rick.jones2@....com]
>> Sent: April-15-16 6:37 PM
>> To: Butler, Peter <pbutler@...usnet.com>; netdev@...r.kernel.org
>> Subject: Re: Poorer networking performance in later kernels?
>>
>> On 04/15/2016 02:02 PM, Butler, Peter wrote:
>>> (Please keep me CC'd to all comments/responses)
>>>
>>> I've tried a kernel upgrade from 3.4.2 to 4.4.0 and see a marked drop
>>> in networking performance.  Nothing was changed on the test systems,
>>> other than the kernel itself (and kernel modules).  The identical
>>> .config used to build the 3.4.2 kernel was brought over into the
>>> 4.4.0 kernel source tree, and any configuration differences (e.g. new
>>> parameters, etc.) were taken as default values.
>>>
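A quick way to see which options silently changed or disappeared during a config port like this is to diff the two files symbol by symbol. A minimal Python sketch (the file names passed on the command line are just examples; the kernel tree's own scripts/diffconfig does much the same thing):

#!/usr/bin/env python3
# Minimal sketch: compare two kernel .config files symbol by symbol to spot
# options that were dropped or changed when porting a config forward.
import sys

def load_config(path):
    """Return {CONFIG_NAME: value}; '# CONFIG_X is not set' is recorded as 'n'."""
    opts = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("CONFIG_") and "=" in line:
                name, _, val = line.partition("=")
                opts[name] = val
            elif line.startswith("# CONFIG_") and line.endswith(" is not set"):
                opts[line[2:-len(" is not set")]] = "n"
    return opts

old = load_config(sys.argv[1])   # e.g. the 3.4.2 .config
new = load_config(sys.argv[2])   # e.g. the 4.4.0 .config
for name in sorted(set(old) | set(new)):
    if old.get(name) != new.get(name):
        print(f"{name}: {old.get(name, '<absent>')} -> {new.get(name, '<absent>')}")

Filtering that output for the NIC driver's CONFIG_ symbols would presumably surface anything lost in the port.
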
>>> The testing was performed on the same actual hardware for both kernel
>>> versions (i.e. take the existing 3.4.2 physical setup, simply boot
>>> into the (new) kernel and run the same test).  The netperf utility
>>> was used for benchmarking and the testing was always performed on
>>> idle systems.
>>>
>>> TCP testing yielded the following results, where the 4.4.0 kernel
>>> only got about 1/2 of the throughput:
>>>
>>
>>> Kernel  Recv      Send      Send                          Utilization       Service Demand
>>>         Socket    Socket    Message Elapsed               Send     Recv     Send    Recv
>>>         Size      Size      Size    Time       Throughput local    remote   local   remote
>>>         bytes     bytes     bytes   secs.      10^6bits/s % S      % S      us/KB   us/KB
>>>
>>> 3.4.2   13631488  13631488  8952    30.01      9370.29    10.14    6.50     0.709   0.454
>>> 4.4.0   13631488  13631488  8952    30.02      5314.03    9.14     14.31    1.127   1.765
>>>
>>> SCTP testing yielded the following results, where the 4.4.0 kernel only got about 1/3 of the throughput:
>>>
>>> Kernel  Recv      Send      Send                          Utilization       Service Demand
>>>         Socket    Socket    Message Elapsed               Send     Recv     Send    Recv
>>>         Size      Size      Size    Time       Throughput local    remote   local   remote
>>>         bytes     bytes     bytes   secs.      10^6bits/s % S      % S      us/KB   us/KB
>>>
>>> 3.4.2   13631488  13631488  8952    30.00      2306.22    13.87    13.19    3.941   3.747
>>> 4.4.0   13631488  13631488  8952    30.01      882.74     16.86    19.14    12.516  14.210
>>>
>>> The same tests were performed a multitude of times, and the results are
>>> always consistent (within a few percent).  I've also tried playing with
>>> various run-time kernel parameters (/proc/sys/net/...) on the
>>> 4.4.0 kernel to alleviate the issue but have had no success at all.
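For what it's worth, a small sketch that dumps a handful of networking sysctls so the run-time values can be compared between the two kernels (the key list here is only an example):

# Sketch: print a few networking sysctls so the values can be compared
# across kernel versions.  The key list below is only an example.
KEYS = [
    "net/core/rmem_max",
    "net/core/wmem_max",
    "net/ipv4/tcp_rmem",
    "net/ipv4/tcp_wmem",
    "net/ipv4/tcp_congestion_control",
]
for key in KEYS:
    try:
        with open("/proc/sys/" + key) as f:
            print(f"{key} = {f.read().strip()}")
    except FileNotFoundError:
        print(f"{key} = <not present on this kernel>")
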
>>>
>>> I'm at a loss as to what could possibly account for such a discrepancy...
>>>
>>
>> I suspect I am not alone in being curious about the CPU(s) present in the systems and the model/whatnot of the NIC being used.  I'm also curious as to why you have what at first glance seem like absurdly large socket buffer sizes.
>>
>> That said, it looks like you have some Really Big (tm) increases in service demand.  Many more CPU cycles being consumed per KB of data transferred.
>>
>> Your message size makes me wonder if you were using a 9000 byte MTU.
>>
>> Perhaps in the move from 3.4.2 to 4.4.0 you lost some or all of the stateless offloads for your NIC(s)?  Running ethtool -k <interface> on both ends under both kernels might be good.
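A sketch of how that comparison could be automated, assuming a Linux ethtool that prints "feature: on/off" lines and an example interface name:

# Sketch: capture `ethtool -k <iface>` output as a dict so the offload state
# under each kernel can be saved and diffed.  Interface name is an example.
import json
import subprocess
import sys

def offloads(iface):
    out = subprocess.run(["ethtool", "-k", iface],
                         capture_output=True, text=True, check=True).stdout
    feats = {}
    for line in out.splitlines()[1:]:            # first line is a heading
        if ":" in line:
            name, _, state = line.partition(":")
            feats[name.strip()] = state.strip()  # e.g. "on", "off [fixed]"
    return feats

# Run once under each kernel, e.g.:  python3 offloads.py eth0 > offloads-4.4.0.json
json.dump(offloads(sys.argv[1]), sys.stdout, indent=2)
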
>>
>> Also, if you did have a 9000 byte MTU under 3.4.2 are you certain you still had it under 4.4.0?
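One quick check for that, reading the MTU straight out of sysfs ("eth0" is an example interface name):

# Sketch: confirm the interface MTU under each kernel via sysfs.
# Expect 9000 here if jumbo frames survived the kernel upgrade.
with open("/sys/class/net/eth0/mtu") as f:
    print("eth0 MTU:", f.read().strip())
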
>>
>> It would (at least to me) also be interesting to run a TCP_RR test comparing the two kernels.  TCP_RR (at least with the default request/response size of one byte) doesn't really care about stateless offloads or MTUs and could show how much difference there is in basic path length (or I suppose in interrupt coalescing behaviour if the NIC in question has a mildly dodgy heuristic for such things).
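A sketch of that TCP_RR run, wrapped so the output from each kernel can be captured side by side (the host address and test length below are just examples):

# Sketch: run netperf's TCP_RR test against the remote netserver and keep the
# raw output so the transaction rates under each kernel can be compared.
import subprocess

def tcp_rr(host, secs=30):
    result = subprocess.run(["netperf", "-H", host, "-t", "TCP_RR", "-l", str(secs)],
                            capture_output=True, text=True, check=True)
    return result.stdout

print(tcp_rr("192.168.1.2"))   # run under each kernel and compare transactions/s
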
>>
>> happy benchmarking,
>>
>> rick jones
>>
>
>
> I think the issue is resolved.  I had to recompile my 4.4.0 kernel with a few options pertaining to the Intel NIC, which somehow (?) got left out or otherwise clobbered when I ported my 3.4.2 .config to the 4.4.0 kernel source tree.  With those changes now in, I see essentially identical performance with the two kernels.  Sorry for any confusion and/or waste of time here.  My bad.
>
>

Can you share which config options you enabled to get your performance back?

-- 
Josh
