Message-ID: <217e3fa9-7782-08c7-1f2b-8dabacaa83f9@gmail.com>
Date:   Fri, 16 Aug 2019 14:35:06 +0200
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Juliana Rodrigueiro <juliana.rodrigueiro@...ra2net.com>,
        netdev@...r.kernel.org
Cc:     edumazet@...gle.com, hkallweit1@...il.com
Subject: Re: r8169: Performance regression and latency instability



On 8/16/19 2:09 PM, Juliana Rodrigueiro wrote:
> Greetings!
> 
> During migration from kernel 3.14 to 4.19, we noticed a regression in network performance. Under the exact same circumstances, the standard deviation of the latency is more than double what it was before on the Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.
> 
> Kernel 3.14:
>     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
>     313.37
> 
> Kernel 4.19:
>     # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
>     632.96
> 
> In contrast, we noticed small improvements in performance with other non-Realtek network cards (igb, tg3), which suggested a possible driver-related bug.
> 
> However after bisecting the code, I ended up with the following patch, which was introduced in kernel 4.17 and modifies net/ipv4:
> 
>     commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
>     Author: Eric Dumazet <edumazet@...gle.com>
>     Date:   Mon Feb 19 11:56:47 2018 -0800
> 
>         tcp: switch to GSO being always on
> 
> Could you please help me to clarify, should GSO be always on on my device? Or does it just affect TCP? According to ethtool it is always off, "ethtool -K eth0 gso on" has no effect, unless I switch SG on.
> 
>     # ethtool -k eth0
>     Offload parameters for eth0:
>     Cannot get device udp large send offload settings: Operation not supported
>     rx-checksumming: on
>     tx-checksumming: off
>     scatter-gather: off
>     tcp-segmentation-offload: off
>     udp-fragmentation-offload: off
>     generic-segmentation-offload: off
>     generic-receive-offload: on
>     large-receive-offload: off
> 
> I validated that reverting "tcp: switch to GSO being always on" successfully brings back the better performance for the r8169 driver.
> 
> I'm sure that reverting that commit is not the optimal solution, so I would like to kindly ask for help to shed some light on this issue.

Hi Juliana

I am sure that any commit made to the TCP stack can show a regression on some particular
combination of packet sizes, MTU size, NIC, and measured metric.

Basically, if your NIC does not support SG and TSO, there is a possibility
that the changes we made to enter the era of 100Gbit and 200Gbit NICs might
hurt a bit.

Lack of SG means that the lower stack might have to perform memory allocations
to perform the segmentation, and this might be slow (or even fail) under memory pressure.
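
To make the cost concrete, here is a minimal userspace sketch of segmentation
into linear buffers. It is a toy model, not the kernel's skb_segment(): the
names struct seg, seg_count() and segment_linear() are hypothetical, and the
MSS mentioned below is an assumption. It only shows that, without gather
support, every segment needs its own allocation plus a copy of the header and
of its share of the payload, and that any of those allocations can fail.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a linear packet buffer. */
struct seg {
    unsigned char *data;   /* header followed by payload, contiguous */
    size_t len;            /* header length + payload chunk length */
};

/* Number of MSS-sized segments needed to carry payload_len bytes. */
static size_t seg_count(size_t payload_len, size_t mss)
{
    if (mss == 0)
        return 0;
    return (payload_len + mss - 1) / mss;
}

/*
 * Split one large payload into linear segments, as a stack has to do
 * when the device cannot gather a packet from several fragments.
 * Returns the number of segments built, or 0 if out[] is too small or
 * an allocation failed (buffers allocated so far are freed).
 */
static size_t segment_linear(const unsigned char *hdr, size_t hdr_len,
                             const unsigned char *payload, size_t payload_len,
                             size_t mss, struct seg *out, size_t out_cap)
{
    size_t n = seg_count(payload_len, mss);
    size_t i, off = 0;

    if (n == 0 || n > out_cap)
        return 0;

    for (i = 0; i < n; i++) {
        size_t left = payload_len - off;
        size_t chunk = left < mss ? left : mss;

        /* One allocation per segment: this is the expensive part. */
        out[i].data = malloc(hdr_len + chunk);
        if (!out[i].data) {
            while (i--)
                free(out[i].data);
            return 0;
        }
        /* Header and payload are both copied into the new buffer. */
        memcpy(out[i].data, hdr, hdr_len);
        memcpy(out[i].data + hdr_len, payload + off, chunk);
        out[i].len = hdr_len + chunk;
        off += chunk;
    }
    return n;
}
```

With an assumed MSS of 1448 bytes, a single 64K send as in the netperf test
above (-m 64K) maps to seg_count(65536, 1448) segments, each paying for one
allocation and two copies in this model.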

I have no idea why you can even turn on SG, if it is turned off by default.

Please give us more information on the NIC:

ethtool -i eth0 ; ifconfig eth0

Possibly try a recent ethtool; yours seems to be pretty old.
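
When comparing offload state between the two kernels, the "name: on/off"
lines of an `ethtool -k` listing like the one quoted above can be checked
mechanically. The helper below is only a sketch: offload_state() is a
hypothetical name, and it assumes the plain "feature: on" / "feature: off"
line layout shown in this thread (a trailing tag such as "[fixed]" after the
value is tolerated).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up one feature in the text printed by "ethtool -k".
 * Returns 1 for "on", 0 for "off", -1 if the feature line is missing
 * or its value is not recognized.
 */
static int offload_state(const char *text, const char *name)
{
    size_t nlen = strlen(name);
    const char *p = text;

    while (*p) {
        const char *eol = strchr(p, '\n');
        const char *end = eol ? eol : p + strlen(p);
        const char *s = p;

        /* Skip indentation in front of the feature name. */
        while (s < end && isspace((unsigned char)*s))
            s++;

        /* The line must start with exactly "<name>:". */
        if ((size_t)(end - s) > nlen &&
            strncmp(s, name, nlen) == 0 && s[nlen] == ':') {
            const char *v = s + nlen + 1;

            while (v < end && isspace((unsigned char)*v))
                v++;
            if (end - v >= 3 && strncmp(v, "off", 3) == 0)
                return 0;
            if (end - v >= 2 && strncmp(v, "on", 2) == 0)
                return 1;
            return -1;
        }
        p = eol ? eol + 1 : end;
    }
    return -1;
}
```

For the listing in the report, this would return 0 for "scatter-gather" and
"generic-segmentation-offload" and 1 for "generic-receive-offload".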

I also see this relevant commit. I have no idea why SG would have any relation to TSO.

commit a7eb6a4f2560d5ae64bfac98d79d11378ca2de6c
Author: Holger Hoffstätte <holger@...lied-asynchrony.com>
Date:   Fri Aug 9 00:02:40 2019 +0200

    r8169: fix performance issue on RTL8168evl
    
    Disabling TSO but leaving SG active results is a significant
    performance drop. Therefore disable also SG on RTL8168evl.
    This restores the original performance.
    
    Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
    Signed-off-by: Holger Hoffstätte <holger@...lied-asynchrony.com>
    Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
    Signed-off-by: David S. Miller <davem@...emloft.net>

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index b2a275d8504c..912bd41eaa1b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6898,9 +6898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
        /* RTL8168e-vl has a HW issue with TSO */
        if (tp->mac_version == RTL_GIGA_MAC_VER_34) {
-               dev->vlan_features &= ~NETIF_F_ALL_TSO;
-               dev->hw_features &= ~NETIF_F_ALL_TSO;
-               dev->features &= ~NETIF_F_ALL_TSO;
+               dev->vlan_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+               dev->hw_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+               dev->features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
        }
 
        dev->hw_features |= NETIF_F_RXALL;
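
For readers less familiar with the feature masks, the hunk above can be
modelled in a few lines of standalone C. This is an illustrative sketch only:
struct toy_dev, toy_disable_tso_and_sg() and the F_* bit values are made up
for the example (the real NETIF_F_* flags live in
include/linux/netdev_features.h, and NETIF_F_ALL_TSO covers more bits than
the two used here). The point is that the same mask is cleared from all three
sets: features (what is active now), hw_features (what the user may toggle
with ethtool -K) and vlan_features (what VLAN devices on top inherit).

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define F_SG      (UINT64_C(1) << 0)
#define F_TSO     (UINT64_C(1) << 1)
#define F_TSO6    (UINT64_C(1) << 2)
#define F_RXALL   (UINT64_C(1) << 3)
#define F_ALL_TSO (F_TSO | F_TSO6)   /* subset of the real NETIF_F_ALL_TSO */

/* Toy stand-in for the three feature sets of a net_device. */
struct toy_dev {
    uint64_t features;       /* currently active features */
    uint64_t hw_features;    /* features the user is allowed to toggle */
    uint64_t vlan_features;  /* features inherited by VLAN devices */
};

/* Mirror of the patched branch: drop TSO and SG everywhere. */
static void toy_disable_tso_and_sg(struct toy_dev *dev)
{
    const uint64_t mask = F_ALL_TSO | F_SG;

    dev->vlan_features &= ~mask;
    dev->hw_features &= ~mask;
    dev->features &= ~mask;
}
```

Because the bits are also removed from hw_features in this model, a later
attempt to switch SG back on from userspace has nothing to act on, while
unrelated bits such as F_RXALL are left untouched.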
