Message-ID: <CAHXqBFL+Ycw4-_LRcOCT0bjhALE3HfVMB7YfcoCruu=zW5PN-g@mail.gmail.com>
Date: Sat, 29 Dec 2012 17:01:01 +0100
From: Michał Mirosław <mirqus@...il.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Andrew Vagin <avagin@...allels.com>, netdev@...r.kernel.org,
    vvs@...allels.com,
    Michał Mirosław <mirq-linux@...e.qmqm.pl>
Subject: Re: Slow speed of tcp connections in a network namespace

2012/12/29 Eric Dumazet <eric.dumazet@...il.com>:
> On Sat, 2012-12-29 at 13:24 +0400, Andrew Vagin wrote:
>> We found a few nodes where the network is slow in containers.
>>
>> For testing speed of TCP connections we use wget, which downloads iso
>> images from the internet.
>>
>> wget in the new netns reports only 1.5 MB/s, but wget in the root netns
>> reports 33MB/s.
>>
>> A few facts:
>> * Experiments show that the window size for CT traffic does not increase
>> beyond ~900, whereas for host traffic the window size increases up to ~14000
>> * packets are sometimes shuffled in the netns.
>> * changing tso/gro/gso on the interfaces does not help
>> * the issue was _NOT_ reproduced if the kernel was booted with maxcpus=1 or bnx2.disable_msi=1
>>
>> I reduced steps to reproduce:
>> * Create a new network namespace "test" and a veth pair.
>> # ip netns add test
>> # ip link add name veth0 type veth peer name veth1
>>
>> * Move veth1 into the netns test
>> # ip link set veth1 netns test
>>
>> * Set an IP address on veth1 and add proper routing rules for this IP
>> in the root netns.
>> # ip link set up dev veth0; ip link set up dev veth0
>> # ip netns exec test ip a add REMOTE dev veth1
>> # ip netns exec test ip r a default via veth1
>> # ip r a REMOTE/32 via dev veth0
>>
>> Tcpdumps for both cases are attached to this message.
>> tcpdump.host - wget in the root netns
>> tcpdump.netns.host - tcpdump for the host device, wget in the new netns
>> tcpdump.netns.veth - tcpdump for the veth1 device, wget in the new netns
>>
>> 3.8-rc1 is used for experiments.
>>
>> Do you have any idea where the problem is?
>
> veth has absolutely no offload features
>
> It needs some care...
>
> At the very minimum, let TCP coalesce do its job by allowing SG
>
> CC Michał Mirosław <mirq-linux@...e.qmqm.pl> for insights.

veth is just like a tunnel device. In terms of offloads, it can do anything
we have software fallbacks for (in case packets get forwarded to real hardware).

> Please try following patch :
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 95814d9..9fefeb3 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -259,6 +259,10 @@ static const struct net_device_ops veth_netdev_ops = {
>          .ndo_set_mac_address = eth_mac_addr,
>  };
>
> +#define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_TSO | \
> +                       NETIF_F_HW_CSUM | NETIF_F_HIGHDMA | \
> +                       NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX)
> +
>  static void veth_setup(struct net_device *dev)
>  {
>          ether_setup(dev);
> @@ -269,9 +273,10 @@ static void veth_setup(struct net_device *dev)
>          dev->netdev_ops = &veth_netdev_ops;
>          dev->ethtool_ops = &veth_ethtool_ops;
>          dev->features |= NETIF_F_LLTX;
> +        dev->features |= VETH_FEATURES;
>          dev->destructor = veth_dev_free;
>
> -        dev->hw_features = NETIF_F_HW_CSUM | NETIF_F_SG | NETIF_F_RXCSUM;
> +        dev->hw_features = VETH_FEATURES;
> }

You missed NETIF_F_RXCSUM in VETH_FEATURES. We might also support
NETIF_F_ALL_TSO rather than NETIF_F_TSO, which is just the IPv4 version.
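
To make the two suggestions concrete, here is a rough user-space sketch of
how the mask could look with NETIF_F_RXCSUM added and NETIF_F_TSO widened to
NETIF_F_ALL_TSO. It is not a tested kernel patch. The bit positions are
made up for illustration (the real ones are in
include/linux/netdev_features.h), veth_has_feature() is a hypothetical
helper that does not exist in the driver, and the NETIF_F_ALL_TSO expansion
is the one I believe kernels of this era use.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel type; illustration only. */
typedef uint64_t netdev_features_t;

/* ASSUMPTION: arbitrary bit positions, NOT the kernel's real values. */
#define NETIF_F_SG         ((netdev_features_t)1 << 0)
#define NETIF_F_HW_CSUM    ((netdev_features_t)1 << 1)
#define NETIF_F_HIGHDMA    ((netdev_features_t)1 << 2)
#define NETIF_F_FRAGLIST   ((netdev_features_t)1 << 3)
#define NETIF_F_HW_VLAN_TX ((netdev_features_t)1 << 4)
#define NETIF_F_HW_VLAN_RX ((netdev_features_t)1 << 5)
#define NETIF_F_LLTX       ((netdev_features_t)1 << 6)
#define NETIF_F_TSO        ((netdev_features_t)1 << 7)  /* TCP over IPv4 */
#define NETIF_F_TSO_ECN    ((netdev_features_t)1 << 8)
#define NETIF_F_TSO6       ((netdev_features_t)1 << 9)  /* TCP over IPv6 */
#define NETIF_F_RXCSUM     ((netdev_features_t)1 << 10)

/* ASSUMPTION: expansion as in kernels around 3.8. */
#define NETIF_F_ALL_TSO (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)

/* The quoted mask with RXCSUM added and TSO widened to ALL_TSO. */
#define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO | \
                       NETIF_F_HW_CSUM | NETIF_F_RXCSUM | NETIF_F_HIGHDMA | \
                       NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX)

/* Hypothetical helper: true if every bit in f is set in VETH_FEATURES. */
static inline int veth_has_feature(netdev_features_t f)
{
        return (VETH_FEATURES & f) == f;
}

/* Compile-time checks for the two points raised above. */
static_assert((VETH_FEATURES & NETIF_F_RXCSUM) != 0,
              "RXCSUM should remain user-toggleable via hw_features");
static_assert((VETH_FEATURES & NETIF_F_ALL_TSO) == NETIF_F_ALL_TSO,
              "all TSO variants should be covered, not only IPv4");
```

NETIF_F_LLTX stays out of the mask on purpose: the quoted patch sets it on
dev->features separately, so it does not end up in hw_features.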
Best Regards,
Michał Mirosław