Message-ID: <217e3fa9-7782-08c7-1f2b-8dabacaa83f9@gmail.com>
Date: Fri, 16 Aug 2019 14:35:06 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Juliana Rodrigueiro <juliana.rodrigueiro@...ra2net.com>,
netdev@...r.kernel.org
Cc: edumazet@...gle.com, hkallweit1@...il.com
Subject: Re: r8169: Performance regression and latency instability
On 8/16/19 2:09 PM, Juliana Rodrigueiro wrote:
> Greetings!
>
> During migration from kernel 3.14 to 4.19, we noticed a regression in network performance. Under the exact same circumstances, the standard deviation of the latency is more than double what it was before on the Realtek RTL8111/8168B (10ec:8168) using the r8169 driver.
>
> Kernel 3.14:
> # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
> 313.37
>
> Kernel 4.19:
> # netperf -v 2 -P 0 -H <netserver-IP>,4 -I 99,5 -t omni -l 1 -- -O STDDEV_LATENCY -m 64K -d Send
> 632.96
>
> In contrast, we noticed small improvements in performance with other non-Realtek network cards (igb, tg3), which suggested a possible driver-related bug.
>
> However after bisecting the code, I ended up with the following patch, which was introduced in kernel 4.17 and modifies net/ipv4:
>
> commit 0a6b2a1dc2a2105f178255fe495eb914b09cb37a
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Mon Feb 19 11:56:47 2018 -0800
>
> tcp: switch to GSO being always on
>
> Could you please help me to clarify, should GSO be always on on my device? Or does it just affect TCP? According to ethtool it is always off, "ethtool -K eth0 gso on" has no effect, unless I switch SG on.
>
> # ethtool -k eth0
> Offload parameters for eth0:
> Cannot get device udp large send offload settings: Operation not supported
> rx-checksumming: on
> tx-checksumming: off
> scatter-gather: off
> tcp-segmentation-offload: off
> udp-fragmentation-offload: off
> generic-segmentation-offload: off
> generic-receive-offload: on
> large-receive-offload: off
>
> I validated that reverting "tcp: switch to GSO being always on" successfully brings back the better performance for the r8169 driver.
>
> I'm sure that reverting that commit is not the optimal solution, so I would like to kindly ask for help to shed some light on this issue.
Hi Juliana,
I am sure that any commit to the TCP stack can show a regression for some particular
combination of packet sizes, MTU size, NIC, and measured metric.
Basically, if your NIC does not support SG and TSO, there is a possibility
that the changes we made to enter the era of 100Gbit and 200Gbit NICs might
hurt a bit.
Lack of SG means that the lower stack might have to perform memory allocations
to perform the segmentation and this might be slow (or even fail) under memory pressure.
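To make the cost concrete, here is a minimal user-space sketch (not kernel code) of segmentation into linear buffers: every MSS-sized segment gets its own allocation and copy, which is roughly the work forced on the stack when the payload cannot be referenced through scatter-gather. The names `struct segment`, `segment_count`, `segment_linear` and `free_segments` are hypothetical, and the MSS values used below (1448 with TCP timestamps, 1460 without, for a 1500-byte MTU) are assumptions about the setup.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of one on-wire segment held in a linear buffer. */
struct segment {
	unsigned char *data;	/* private copy of this segment's payload */
	size_t len;
};

/* Number of MSS-sized segments needed for payload_len bytes (rounds up). */
size_t segment_count(size_t payload_len, size_t mss)
{
	if (mss == 0)
		return 0;
	return (payload_len + mss - 1) / mss;
}

/*
 * Split payload into freshly allocated linear segments.
 * Returns the number of segments stored in *out, or 0 if the payload is
 * empty or any allocation fails (partial work is undone in that case).
 */
size_t segment_linear(const unsigned char *payload, size_t payload_len,
		      size_t mss, struct segment **out)
{
	size_t n = segment_count(payload_len, mss);
	struct segment *segs;
	size_t i;

	*out = NULL;
	if (n == 0)
		return 0;

	segs = calloc(n, sizeof(*segs));
	if (!segs)
		return 0;

	for (i = 0; i < n; i++) {
		size_t off = i * mss;
		size_t left = payload_len - off;
		size_t len = left < mss ? left : mss;

		/* One allocation and one copy per segment. */
		segs[i].data = malloc(len);
		if (!segs[i].data) {
			while (i-- > 0)
				free(segs[i].data);
			free(segs);
			return 0;
		}
		memcpy(segs[i].data, payload + off, len);
		segs[i].len = len;
	}

	*out = segs;
	return n;
}

/* Release everything returned by segment_linear(). */
void free_segments(struct segment *segs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		free(segs[i].data);
	free(segs);
}
```

Under these assumptions, a single 64K send as in the netperf test (-m 64K) with an MSS of 1448 gives segment_count(65536, 1448) == 46, so 46 buffer allocations and copies per send in this model, each of which can be slow or fail under memory pressure.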
I have no idea why you can even turn on SG, if it is turned off by default.
Please give us more information on the NIC:
ethtool -i eth0 ; ifconfig eth0
Please also try a recent ethtool; yours seems to be pretty old.
I also see this relevant commit, although I have no idea why SG would have any relation to TSO:
commit a7eb6a4f2560d5ae64bfac98d79d11378ca2de6c
Author: Holger Hoffstätte <holger@...lied-asynchrony.com>
Date: Fri Aug 9 00:02:40 2019 +0200
r8169: fix performance issue on RTL8168evl
Disabling TSO but leaving SG active results in a significant
performance drop. Therefore disable also SG on RTL8168evl.
This restores the original performance.
Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO")
Signed-off-by: Holger Hoffstätte <holger@...lied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
Signed-off-by: David S. Miller <davem@...emloft.net>
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index b2a275d8504c..912bd41eaa1b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6898,9 +6898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	/* RTL8168e-vl has a HW issue with TSO */
 	if (tp->mac_version == RTL_GIGA_MAC_VER_34) {
-		dev->vlan_features &= ~NETIF_F_ALL_TSO;
-		dev->hw_features &= ~NETIF_F_ALL_TSO;
-		dev->features &= ~NETIF_F_ALL_TSO;
+		dev->vlan_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+		dev->hw_features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
+		dev->features &= ~(NETIF_F_ALL_TSO | NETIF_F_SG);
 	}
 	dev->hw_features |= NETIF_F_RXALL;
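
For readers unfamiliar with the feature masks, the effect of the change above can be sketched in user space. This is only an illustration: the bit positions, the `F_*` names, and `struct fake_netdev` are hypothetical stand-ins, and `F_ALL_TSO` here covers just two bits, a simplification of the kernel's `NETIF_F_ALL_TSO`.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define F_SG		(UINT64_C(1) << 0)
#define F_TSO		(UINT64_C(1) << 1)
#define F_TSO6		(UINT64_C(1) << 2)
#define F_ALL_TSO	(F_TSO | F_TSO6)	/* simplified stand-in */

/* Stand-in for the three feature sets the patch touches. */
struct fake_netdev {
	uint64_t features;	/* currently active */
	uint64_t hw_features;	/* user-toggleable via ethtool */
	uint64_t vlan_features;	/* inherited by VLAN devices */
};

/* Behaviour before the patch: TSO is cleared, SG stays set. */
void drop_tso_only(struct fake_netdev *dev)
{
	dev->vlan_features &= ~F_ALL_TSO;
	dev->hw_features &= ~F_ALL_TSO;
	dev->features &= ~F_ALL_TSO;
}

/* Behaviour after the patch: TSO and SG are cleared together. */
void drop_tso_and_sg(struct fake_netdev *dev)
{
	dev->vlan_features &= ~(F_ALL_TSO | F_SG);
	dev->hw_features &= ~(F_ALL_TSO | F_SG);
	dev->features &= ~(F_ALL_TSO | F_SG);
}
```

Starting from a device with SG and TSO set, drop_tso_only() leaves only F_SG in each mask, which is the "TSO off, SG on" state the commit message describes as slow; drop_tso_and_sg() leaves none of the three bits set.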