Message-ID: <1428374918.25985.206.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Mon, 06 Apr 2015 19:48:38 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Jan Engelhardt <jengelh@...i.de>
Cc:	Linux Networking Developer Mailing List <netdev@...r.kernel.org>
Subject: Re: TSO on veth device slows transmission to a crawl

On Tue, 2015-04-07 at 00:45 +0200, Jan Engelhardt wrote:
> I have here a Linux 3.19(.0) system where activated TSO on a veth slave 
> device makes IPv4-TCP transfers going into that veth-connected container 
> progress slowly.
> 
> 
> Host side (hv03):
> hv03# ip l
> 2: ge0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
> state UP mode DEFAULT group default qlen 1000 [Intel 82579LM]
> 7: ve-build01: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc 
> pfifo_fast state UP mode DEFAULT group default qlen 1000 [veth]
> hv03# ethtool -k ve-build01
> Features for ve-build01:
> rx-checksumming: on
> tx-checksumming: on
>         tx-checksum-ipv4: off [fixed]
>         tx-checksum-ip-generic: on
>         tx-checksum-ipv6: off [fixed]
>         tx-checksum-fcoe-crc: off [fixed]
>         tx-checksum-sctp: off [fixed]
> scatter-gather: on
>         tx-scatter-gather: on
>         tx-scatter-gather-fraglist: on
> tcp-segmentation-offload: on
>         tx-tcp-segmentation: on
>         tx-tcp-ecn-segmentation: on
>         tx-tcp6-segmentation: on
> udp-fragmentation-offload: on
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: on
> tx-vlan-offload: on
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: on
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: on [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: on
> tx-ipip-segmentation: on
> tx-sit-segmentation: on
> tx-udp_tnl-segmentation: on
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: on
> rx-vlan-stag-hw-parse: on
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
> 
> 
> Guest side (build01):
> build01# ip l
> 2: host0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast 
> state UP mode DEFAULT group default qlen 1000
> build01# ethtool -k host0
> Features for host0:
> rx-checksumming: on
> tx-checksumming: on
>         tx-checksum-ipv4: off [fixed]
>         tx-checksum-ip-generic: on
>         tx-checksum-ipv6: off [fixed]
>         tx-checksum-fcoe-crc: off [fixed]
>         tx-checksum-sctp: off [fixed]
> scatter-gather: on
>         tx-scatter-gather: on
>         tx-scatter-gather-fraglist: on
> tcp-segmentation-offload: on
>         tx-tcp-segmentation: on
>         tx-tcp-ecn-segmentation: on
>         tx-tcp6-segmentation: on
> udp-fragmentation-offload: on
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: on
> tx-vlan-offload: on
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: on
> rx-vlan-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: on [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: on
> tx-ipip-segmentation: on
> tx-sit-segmentation: on
> tx-udp_tnl-segmentation: on
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
> tx-vlan-stag-hw-insert: on
> rx-vlan-stag-hw-parse: on
> rx-vlan-stag-filter: off [fixed]
> l2-fwd-offload: off [fixed]
> busy-poll: off [fixed]
> 
> 
> Using an independent machine, I query a xinetd-chargen sample service
> to send a sufficient number of bytes through the pipe.
> 
> ares40# traceroute build01
> traceroute to build01 (x), 30 hops max, 60 byte packets
>  1  hv03 ()  0.713 ms  0.663 ms  0.636 ms
>  2  build01 ()  0.905 ms  0.882 ms  0.858 ms
> 
> ares40$ socat tcp4-connect:build01:19 - | pv >/dev/null
>  480KiB 0:00:05 [91.5KiB/s] [    <=>             ]
> 1.01MiB 0:00:11 [91.1KiB/s] [          <=>       ]
> 1.64MiB 0:00:18 [ 110KiB/s] [                <=> ]
> 
> (PV is the Pipe Viewer, showing throughput.)
> 
> It hovers between 80 and 110 kilobytes/sec, which is 600-fold lower
> than what I would normally see. Once TSO is turned off on the
> container-side interface:
> 
> build01# ethtool -K host0 tso off
> (must be host0 // doing it on ve-build01 has no effect)
> 
> I observe restoration of expected throughput:
> 
> ares40$ socat tcp4-connect:build01:19 - | pv >/dev/null
>  182MiB 0:02:05 [66.1MiB/s] [                       <=> ]
> 
> 
> This problem does not manifest when using IPv6.
> The problem also does not manifest if the TCP4 connection is kernel-local,
> e.g. hv03->build01.
> The problem also does not manifest if the TCP4 connection is outgoing, 
> e.g. build01->ares40.
> IOW, the tcp4 listening socket needs to be inside a veth-connected 
> container.

Hi Jan

Nothing comes to mind. It would help if you could provide a script to
reproduce the issue.

I've tried the following on current net-next:

lpaa23:~# cat veth.sh
#!/bin/sh
#This script has to be launched as root
#
brctl addbr br0
ip addr add 192.168.64.1/24 dev br0
ip link set br0 up
ip link add name ext0 type veth peer name int0
ip link set ext0 up
brctl addif br0 ext0
ip netns add vnode0
ip link set dev int0 netns vnode0
ip netns exec vnode0 ip addr add 192.168.64.2/24 dev int0 
ip netns exec vnode0 ip link set dev int0 up
ip link add name ext1 type veth peer name int0
ip link set ext1 up
brctl addif br0 ext1
ip netns add vnode1
ip link set dev int0 netns vnode1
ip netns exec vnode1 ip addr add 192.168.64.3/24 dev int0
ip netns exec vnode1 ip link set dev int0 up

ip netns exec vnode0 netserver &
sleep 1
ip netns exec vnode1 netperf -H 192.168.64.2 -l 10

# Cleanup
ip netns exec vnode0 killall netserver
ifconfig br0 down ; brctl delbr br0
ip netns delete vnode0 ; ip netns delete vnode1
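
The script above leaves all offload settings at their defaults. To record
whether TSO is actually enabled on the in-namespace interface before each
run, a small helper along these lines could be used. This is only a sketch:
tso_state is a hypothetical name, not part of the script above, and it
assumes the `ethtool -k` output format shown in the quoted report.

```shell
#!/bin/sh
# Hypothetical helper (not part of veth.sh): read `ethtool -k` output on
# stdin and print the state ("on" or "off") of the top-level
# tcp-segmentation-offload feature. Indented sub-features such as
# tx-tcp-segmentation are ignored because their first field starts with
# whitespace and so does not match exactly.
tso_state() {
    awk -F': *' '
        $1 == "tcp-segmentation-offload" {
            split($2, parts, " ")   # drop a trailing "[fixed]" if present
            print parts[1]
            exit
        }'
}

# Example use inside a namespace created by veth.sh (requires root):
#   ip netns exec vnode0 ethtool -k int0 | tso_state
```

Logging this next to each netperf result makes it easier to tell a "tso on"
run from a "tso off" run (toggled with `ethtool -K int0 tso off`, as in the
report).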


lpaa23:~# ./veth.sh
Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.64.2 () port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  16384  16384    10.00    14924.09   

Seems like a pretty honest result.
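
netperf reports throughput in 10^6 bits/sec, while the pv figures in the
report are in KiB/s and MiB/s. To put the two side by side, a conversion
like the following can be used. It is a sketch only: mbit_to_mib is a
hypothetical helper name, and it assumes pv's MiB means 2^20 bytes.

```shell
#!/bin/sh
# Hypothetical helper: convert a netperf throughput figure given in
# 10^6 bits/sec into MiB/s (2^20 bytes per second), the unit pv prints.
mbit_to_mib() {
    awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit * 1000000 / 8 / 1048576 }'
}

# The netperf figure from the run above:
mbit_to_mib 14924.09
```

That works out to roughly 1779 MiB/s for the namespace-to-namespace run.
The two measurements come from different setups (a remote client in the
report, two namespaces on one host here), so the numbers are not directly
comparable.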

