Message-ID: <1303730701.2747.110.camel@edumazet-laptop>
Date: Mon, 25 Apr 2011 13:25:01 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Dominik Kaspar <dokaspar.ietf@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: Linux TCP's Robustness to Multipath Packet Reordering
On Monday, 25 April 2011 at 12:37 +0200, Dominik Kaspar wrote:
> Hello,
>
> Knowing how critical packet reordering is for standard TCP, I am
> currently testing how robust Linux TCP is when packets are forwarded
> over multiple paths (with different bandwidths and RTTs). Since Linux
> TCP adapts its "dupAck threshold" to an estimated level of packet
> reordering, I expect it to be much more robust than a standard TCP
> that strictly follows the RFCs. Indeed, as you can see in the
> following plot, my experiments show a step-wise adaptation of Linux
> TCP to heavy reordering. After many minutes, Linux TCP finally reaches
> a data throughput close to the perfect aggregated rate of the two
> paths (emulated with characteristics similar to an IEEE 802.11b
> (WLAN) link and a 3G (HSPA) link):
>
> http://home.simula.no/~kaspar/static/mptcp-emu-wlan-hspa-00.png
>
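> (The sender's reordering estimate can be watched while a transfer
> runs; the commands below are only illustrative, and the address is a
> placeholder for the client's primary interface:)
>
>   # per-connection view: cwnd, rtt, and the current reordering metric
>   ss -t -i dst 10.0.1.1
>   # global counters, including detected reordering events
>   netstat -s | grep -i reorder
>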
> Does anyone have a clue what's going on here? Why does the aggregated
> throughput increase in steps? And what could be the reason it takes
> minutes to adapt to the full capacity, when in other cases Linux TCP
> adapts much faster (for example, if the bandwidths of both paths are
> equal)? I would highly appreciate some advice from the netdev
> community.
>
> Implementation details:
> This multipath TCP experiment ran between a sending machine with a
> single Ethernet interface (eth0) and a client with two Ethernet
> interfaces (eth1, eth2). The machines are connected through a switch
> and tc/netem is used to emulate the bandwidth and RTT of both paths.
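>
> For example, each path can be emulated with a netem qdisc (delay) and
> a tbf child (rate); the numbers below are placeholders chosen to match
> the 500/1000 KB/s example further down, not the real WLAN/HSPA
> settings, and the qdiscs must sit wherever they can shape the data
> direction:
>
>   # path 1 (eth1, WLAN-like: ~1000 KB/s, low delay)
>   tc qdisc add dev eth1 root handle 1:0 netem delay 20ms
>   tc qdisc add dev eth1 parent 1:1 handle 10: tbf rate 8mbit buffer 1600 limit 3000
>   # path 2 (eth2, HSPA-like: ~500 KB/s, higher delay)
>   tc qdisc add dev eth2 root handle 1:0 netem delay 150ms
>   tc qdisc add dev eth2 parent 1:1 handle 10: tbf rate 4mbit buffer 1600 limit 3000
>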
> TCP connections are established using iperf between eth0 and eth1
> (the primary path). At the sender, an iptables NFQUEUE rule (sketched
> below) is used to "spoof" the destination IP address of outgoing
> packets and force some of them to travel to eth2 instead of eth1 (the
> secondary path). This multipath scheduling happens in proportion to
> the emulated bandwidths, so if the paths are set to 500 and 1000 KB/s,
> packets are distributed in a 1:2 ratio. At the client, the RAWDNAT
> target of iptables (also sketched below) translates the spoofed
> destination addresses back to the original one, so that all packets
> end up at eth1, although a portion actually travelled via eth2. ACKs
> are not scheduled over multiple paths; they always travel back on the
> primary path. TCP notices nothing of the multipath forwarding except
> its side effect, packet reordering, which can be huge if the path
> RTTs are set very differently.
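>
> The diversion at the sender looks roughly like this (chain, queue
> number, and port are placeholders; the userspace program bound to the
> queue, which rewrites the destination address and reinjects each
> packet, is not shown):
>
>   # hand outgoing iperf segments to a userspace queue for rewriting
>   iptables -t mangle -A OUTPUT -p tcp --dport 5001 -j NFQUEUE --queue-num 0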
>
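> And the reverse translation at the client, with placeholder addresses
> (10.0.2.1 = the spoofed address on eth2, 10.0.1.1 = the original eth1
> address):
>
>   # rewrite the spoofed destination back, before connection tracking
>   iptables -t raw -A PREROUTING -i eth2 -d 10.0.2.1 \
>     -j RAWDNAT --to-destination 10.0.1.1
>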
Hi Dominik
Implementation details of the tc/netem stages are important to fully
understand how the TCP stack can react.
Is TSO active at the sender side, for example?
Your results show that only some exceptional events make the bandwidth
really change.
A tcpdump/pcap capture of the first ~10,000 packets would be nice to
provide (not on the mailing list, but on your web site).
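Something along these lines would be enough for a first look (5001
assumes iperf's default port; adjust to your setup):

  # capture headers only, stop after the first 10000 packets
  tcpdump -i eth0 -c 10000 -s 128 -w reorder-test.pcap tcp port 5001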