Message-ID: <abcb6fde-4b08-73ff-02c9-01609a41a087@itcare.pl>
Date: Sat, 10 Nov 2018 23:04:22 +0100
From: Paweł Staszewski <pstaszewski@...are.pl>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: Saeed Mahameed <saeedm@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Kernel 4.19 network performance - forwarding/routing normal users traffic
On 10.11.2018 at 22:53, Paweł Staszewski wrote:
>
>
> On 10.11.2018 at 22:01, Jesper Dangaard Brouer wrote:
>> On Sat, 10 Nov 2018 21:02:10 +0100
>> Paweł Staszewski <pstaszewski@...are.pl> wrote:
>>
>>> On 10.11.2018 at 20:34, Jesper Dangaard Brouer wrote:
>>>> I want you to experiment with:
>>>>
>>>> ethtool --set-priv-flags DEVICE rx_striding_rq off
>>> Just checked that the ConnectX-4 previously had this disabled:
>>> ethtool --show-priv-flags enp175s0f0
>>>
>>> Private flags for enp175s0f0:
>>> rx_cqe_moder : on
>>> tx_cqe_moder : off
>>> rx_cqe_compress : off
>>> rx_striding_rq : off
>>> rx_no_csum_complete: off
>>>
>> The CX4 hardware does not have this feature (p.s. the CX4-Lx does).
>>
>>> So now we are on ConnectX-5 and have it enabled. The ConnectX-5 for
>>> sure changed the CPU load: I now see at most 50-60% CPU, where with
>>> the ConnectX-4 it was sometimes near 100% with the same configuration.
>> I (strongly) believe the CPU load was related to the page-allocator
>> lock contention that Aaron fixed.
>>
> Yes, I think it was both - most of the CPU problems were due to the
> page-allocator.
> But there is also a CPU load difference after changing from ConnectX-4
> to ConnectX-5 - about 10% in total - though yes, most of it, around
> 40%, comes from Aaron's patch :) - really good job :)
>
>
> Now I'm experimenting with the ring configuration for the ConnectX-5 NICs.
> After reading this paper:
> https://netdevconf.org/2.1/slides/apr6/network-performance/04-amir-RX_and_TX_bulking_v2.pdf
>
> I changed from RX: 8192 / TX: 4096 to RX: 8192 / TX: 256.
>
> After this I gained about 5 Gbit/s of RX and TX traffic, with less CPU load.
> Before the change there was 59/59 Gbit/s.
>
> After the change there is 64/64 Gbit/s.
>
> bwm-ng v0.6.1 (probing every 1.000s), press 'h' for help
>   input: /proc/net/dev type: rate
>   iface                     Rx                Tx             Total
> ==============================================================================
>   enp175s0:         44.45 Gb/s        19.69 Gb/s        64.14 Gb/s
>   enp216s0:         19.69 Gb/s        44.49 Gb/s        64.19 Gb/s
> ------------------------------------------------------------------------------
>   total:            64.14 Gb/s        64.18 Gb/s       128.33 Gb/s
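The ring resize described above can be applied with ethtool; a minimal sketch, assuming the interface name enp175s0 from the output above stands in for whichever NIC is being tuned:

```shell
# Show current ring sizes (preset maximums and current settings)
ethtool -g enp175s0

# Shrink the TX ring to 256 descriptors while keeping RX at 8192,
# as in the test above; a smaller TX ring keeps fewer descriptors
# queued in the NIC, which can improve cache locality on the TX
# completion path
ethtool -G enp175s0 rx 8192 tx 256
```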
>
>
Also, after this change the kernel freed some memory - about 500MB.
Still seeing squeezed events, but fewer, even with more traffic:
CPU        total/sec  dropped/sec  squeezed/sec  collision/sec  rx_rps/sec  flow_limit/sec
CPU:00             0            0             0              0           0               0
CPU:01             0            0             0              0           0               0
CPU:02             0            0             0              0           0               0
CPU:03             0            0             0              0           0               0
CPU:04             0            0             0              0           0               0
CPU:05             0            0             0              0           0               0
CPU:06             0            0             0              0           0               0
CPU:07             0            0             0              0           0               0
CPU:08             0            0             0              0           0               0
CPU:09             0            0             0              0           0               0
CPU:10             0            0             0              0           0               0
CPU:11             0            0             0              0           0               0
CPU:12             0            0             0              0           0               0
CPU:13             0            0             0              0           0               0
CPU:14        389270            0            41              0           0               0
CPU:15        375543            0            32              0           0               0
CPU:16        385847            0            22              0           0               0
CPU:17        412293            0            34              0           0               0
CPU:18        401287            0            30              0           0               0
CPU:19        368345            0            30              0           0               0
CPU:20        395452            0            28              0           0               0
CPU:21        374032            0            38              0           0               0
CPU:22        342036            0            32              0           0               0
CPU:23        374773            0            34              0           0               0
CPU:24        356139            0            31              0           0               0
CPU:25        392725            0            32              0           0               0
CPU:26        385937            0            37              0           0               0
CPU:27        385282            0            37              0           0               0
CPU:28             0            0             0              0           0               0
CPU:29             0            0             0              0           0               0
CPU:30             0            0             0              0           0               0
CPU:31             0            0             0              0           0               0
CPU:32             0            0             0              0           0               0
CPU:33             0            0             0              0           0               0
CPU:34             0            0             0              0           0               0
CPU:35             0            0             0              0           0               0
CPU:36             0            0             0              0           0               0
CPU:37             0            0             0              0           0               0
CPU:38             0            0             0              0           0               0
CPU:39             0            0             0              0           0               0
CPU:40             0            0             0              0           0               0
CPU:41             0            0             0              0           0               0
CPU:42        340817            0            33              0           0               0
CPU:43        364805            0            42              0           0               0
CPU:44        298484            0            29              0           0               0
CPU:45        292798            0            30              0           0               0
CPU:46        301739            0            24              0           0               0
CPU:47        275116            0            20              0           0               0
CPU:48        319237            0            34              0           0               0
CPU:49        290350            0            29              0           0               0
CPU:50        307084            0            30              0           0               0
CPU:51        332908            0            24              0           0               0
CPU:52        300151            0            24              0           0               0
CPU:53        310140            0            28              0           0               0
CPU:54        341788            0            28              0           0               0
CPU:55        320344            0            28              0           0               0
Summed:      9734722            0           860              0           0               0
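The squeezed column above comes from the time_squeeze field of /proc/net/softnet_stat. A minimal sketch of a parser for that file (assuming the usual field layout for kernels of this era: column 0 = packets processed, column 1 = dropped, column 2 = time_squeeze, all values hexadecimal; the function name is made up for illustration):

```python
def parse_softnet_stat(text):
    """Parse /proc/net/softnet_stat contents into per-CPU counters.

    Each line corresponds to one online CPU; fields are hexadecimal.
    Column 0 is packets processed, column 1 is packets dropped, and
    column 2 is time_squeeze (the NAPI budget or time slice ran out
    before the backlog was fully drained).
    """
    stats = []
    for cpu, line in enumerate(text.splitlines()):
        fields = [int(f, 16) for f in line.split()]
        stats.append({
            "cpu": cpu,
            "processed": fields[0],
            "dropped": fields[1],
            "squeezed": fields[2],
        })
    return stats
```

Against the live system this would be driven with parse_softnet_stat(open("/proc/net/softnet_stat").read()), filtering for rows with a non-zero processed count as in the table above.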