lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 25 Apr 2011 14:59:12 +0200
From:	Carsten Wolff <carsten@...ffcarsten.de>
To:	Dominik Kaspar <dokaspar.ietf@...il.com>
Cc:	netdev@...r.kernel.org
Subject: Re: Linux TCP's Robustness to Multipath Packet Reordering

Hi Dominik,

On Monday 25 April 2011, Dominik Kaspar wrote:
> Hello,
> 
> Knowing how critical packet reordering is for standard TCP, I am
> currently testing how robust Linux TCP is when packets are forwarded
> over multiple paths (with different bandwidth and RTT). Since Linux
> TCP adapts its "dupAck threshold" to an estimated level of packet
> reordering, I expect it to be much more robust than a standard TCP
> that strictly follows the RFCs. Indeed, as you can see in the
> following plot, my experiments show a step-wise adaptation of Linux
> TCP to heavy reordering. After many minutes, Linux TCP finally reaches
> a data throughput close to the perfect aggregated data rate of two
> paths (emulated with characteristics similar to IEEE 802.11b (WLAN)
> and a 3G link (HSPA)):
> 
> http://home.simula.no/~kaspar/static/mptcp-emu-wlan-hspa-00.png
> 
> Does anyone have clues what's going on here? Why does the aggregated
> throughput increase in steps? And what could be the reason it takes
> minutes to adapt to the full capacity, when in other cases, Linux TCP
> adapts much faster (for example if the bandwidth of both paths are
> equal). I would highly appreciate some advice from the netdev
> community.

the throughput increase in steps is most likely caused by Linux's reordering 
detection and quantization. The DupThresh (tp->reordering) is only increased 
when reordering is detected and is then set to a value that depends on current 
inflight/pipe. This means, on a path with only reordering and no loss, where a 
very large DupThresh is best, you will see those steps in the throughput 
everytime when Linux detects reordering during a time where cwnd is large. 
This on the other hand depends purely on timing/luck.
Linux is also only able to quantize reordering during disorder state, which 
leaves out many possible quantization samples, escpecially the larger ones, 
which would increase DupThresh to higher values.

Also, reordering detection depends very much on TCP options. Which TCP Options 
were enabled in your test? Timestamps? D-SACK?

Carsten

> 
> Implementation details:
> This multipath TCP experiment ran between a sending machine with a
> single Ethernet interface (eth0) and a client with two Ethernet
> interfaces (eth1, eth2). The machines are connected through a switch
> and tc/netem is used to emulate the bandwidth and RTT of both paths.
> TCP connections are established using iperf between eth0 and eth1 (the
> primary path). At the sender, an iptables' NFQUEUE is used to "spoof"
> the destination IP address of outgoing packets and force some to
> travel to eth2 instead of eth1 (the secondary path). This multipath
> scheduling happens in proportion to the emulated bandwidths, so if the
> paths are set to 500 and 1000 KB/s, then packets are distributed in a
> 1:2 ratio. At the client, iptables' RAWDNAT is used to translate the
> spoofed IP addresses back to their original, so that all packets end
> up at eth1, although a portion actually travelled to eth2. ACKs are
> not scheduled over multiple paths, but always travel back on the
> primary path. TCP does not notice anything of the multipath
> forwarding, except the side-effect of packet reordering, which can be
> huge if the path RTTs are set very differently.
> 
> Best regards,
> Dominik
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
           /\-ยด-/\
          (  @ @  )
________o0O___^___O0o________
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ