Date:   Thu, 13 Dec 2018 05:37:11 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Willy Tarreau <w@....eu>, Marek Majkowski <marek@...udflare.com>
Cc:     netdev@...r.kernel.org
Subject: Re: splice() performance for TCP socket forwarding



On 12/13/2018 04:55 AM, Willy Tarreau wrote:

> 
> It's quite strange; it doesn't match what I'm used to at all. In haproxy
> we're using splicing between sockets as well, and for medium to large
> objects we always get much better performance with splicing than without.
> Three years ago, during a test, we reached 60 Gbps on a 4-core machine
> using two 40G NICs, which is not an exceptional sizing. And between
> processes on the loopback, numbers around 100G are totally possible. By
> the way, this is one test you should start with, to verify whether the
> issue is more on the splice side or on the NIC's side. It might be that
> your network driver is totally inefficient when used with GRO/GSO. In my
> case, multi-10G using ixgbe and 40G using mlx5 have always shown
> excellent results.
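
For readers following the thread, the socket-to-socket forwarding pattern
under discussion looks roughly like the sketch below. splice() only moves
data between a file descriptor and a pipe, so a proxy splices pages from
the source socket into a pipe, then from the pipe into the destination
socket. This is a minimal illustration, not haproxy's actual code: the
helper name splice_forward is made up here, and EAGAIN/partial-transfer
handling is reduced to the essentials.

#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Move up to len bytes from one connected socket to another through a
 * pipe, without copying the payload through user space. Returns the
 * number of bytes forwarded, or -1 on setup failure. */
static ssize_t splice_forward(int src_fd, int dst_fd, size_t len)
{
	int pipefd[2];
	ssize_t in, out, total = 0;

	if (pipe(pipefd) < 0)
		return -1;

	while ((size_t)total < len) {
		/* Socket -> pipe: page references are moved, not copied. */
		in = splice(src_fd, NULL, pipefd[1], NULL, len - total,
			    SPLICE_F_MOVE | SPLICE_F_MORE);
		if (in <= 0)
			break;

		/* Pipe -> socket: drain what was just spliced in. */
		while (in > 0) {
			out = splice(pipefd[0], NULL, dst_fd, NULL, in,
				     SPLICE_F_MOVE | SPLICE_F_MORE);
			if (out <= 0)
				goto done;
			in -= out;
			total += out;
		}
	}
done:
	close(pipefd[0]);
	close(pipefd[1]);
	return total;
}

Note that the amount moved per splice() call is bounded by the pipe's
capacity, which is where the page-packing behaviour discussed next
comes in.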

Maybe the mlx5 driver is in LRO mode, packing TCP payload into 4K pages?

bnx2x GRO/LRO has this mode, meaning that around 8 pages are used for a
GRO packet of ~32 KB, while mlx4, for instance, would use one page frag
for every ~1428 bytes of payload.
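
Rough arithmetic shows why the frag density matters for splice(),
assuming each spliced frag lands in its own pipe buffer slot and the
pipe has the default 16 slots. For a ~32 KB GRO packet:

    full 4K pages:      32768 / 4096  =  8 frags
    ~1428-byte frags:   32768 / 1428 ~= 23 frags

With full pages a single splice() into a default pipe can move up to
64 KB, while ~1428-byte frags cap it around 16 * 1428 ~= 22 KB, costing
correspondingly more syscalls per byte forwarded.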

