Date: Thu, 22 Nov 2007 14:16:42 +0900 (JST)
From: Ryousei Takano <takano-ryousei@...t.go.jp>
To: netdev@...r.kernel.org
Subject: [RFC][PATCH 0/3] PSPacer qdisc module
Hi all,
I sent this mail yesterday, but it was not delivered, so I am resending it.
I apologize if you receive duplicate copies.
What is PSPacer?
PSPacer (Precise Software Pacer) is a qdisc module that provides
precise transmission bandwidth control. It smooths the bursty traffic that
TCP often generates, without requiring any special hardware.
Bursty traffic can degrade communication performance because it
causes buffer overflow at intermediate network nodes, which results in
packet loss. In bursty traffic, packets are sent back to back. Adding
a short pause between the packets avoids these traffic bursts.
PSPacer controls the interval between outgoing packets very precisely.
The key idea of PSPacer is to determine the transmission timing of packets
by the number of bytes transferred. If packets are transferred back to
back, the time at which a packet is sent is determined by the number of
bytes sent before it. PSPacer fills the gaps between time-aligned
"real packets" (the packets sent by user programs) with
"gap packets". The real packets and gap packets are sent back to back,
so the transmission timing of each real packet can be precisely
controlled by adjusting the gap packet size. IEEE 802.3x PAUSE frames
are used as the gap packets. PAUSE frames are discarded at the switch
input port, so only the real packets pass through the switch, keeping
their original intervals.
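
As a rough illustration of the byte-counting idea above, the following
Python sketch computes how many gap bytes would follow each real frame.
It is not taken from the patch: the function name gap_bytes, the choice
of frame size, and the assumption that the link is kept busy at all times
are illustrative assumptions only.

```python
def gap_bytes(frame_bytes: int, target_bps: int, max_bps: int) -> int:
    """Gap bytes to send after one real frame (illustrative, not the patch's code).

    If the link is always busy, a real frame should occupy the fraction
    target_bps / max_bps of its time slot. The slot therefore spans
    frame_bytes * max_bps / target_bps bytes, and the rest of it is gap.
    """
    if not 0 < target_bps <= max_bps:
        raise ValueError("target rate must be in (0, max rate]")
    slot_bytes = frame_bytes * max_bps // target_bps
    return slot_bytes - frame_bytes


if __name__ == "__main__":
    # 500 Mbps target on a 1 Gbps link: the gap is as long as the frame itself.
    print(gap_bytes(1518, 500_000_000, 1_000_000_000))
```

A real implementation would also have to respect the minimum Ethernet
frame size when emitting gap packets, which this sketch ignores.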
Several software-based pacing schemes have been proposed in the past.
These schemes control packet transmission timing with timer interrupts.
Therefore, to achieve precise pacing, they require the operating system
to maintain a high resolution timer, which could incur a large overhead.
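
To give a sense of the timer resolution involved, this sketch computes
the spacing between the starts of consecutive packets at a given rate.
The 1500-byte packet size, the 500 Mbps rate, and the 1 ms tick (a kernel
built with HZ=1000) are assumptions chosen for illustration; they are not
figures from the paper.

```python
def packet_interval_us(frame_bytes: int, rate_bps: int) -> float:
    """Time between the starts of consecutive frames at rate_bps, in microseconds."""
    return frame_bytes * 8 * 1_000_000 / rate_bps


if __name__ == "__main__":
    interval = packet_interval_us(1500, 500_000_000)
    tick_us = 1000.0  # assumed 1 ms timer tick (HZ=1000)
    print(f"interval between packets: {interval:.1f} us")
    print(f"packets due per timer tick: {tick_us / interval:.1f}")
```

Under these assumptions dozens of packets fall due within a single tick,
so a timer-driven pacer would need a far finer timer to space them
individually.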
The patchset consists of two parts: one part is to be applied to the Linux
kernel, and the other is to be applied to iproute2.
For a detailed description and the usage of PSPacer, please refer to
our project page (http://www.gridmpi.org/gridtcp.jsp) and the paper
"Design and Evaluation of Precise Software Pacing Mechanisms for Fast
Long-Distance Networks," in PFLDnet2005.
Usage
- setup qdiscs
(add the PSPacer qdisc as the root qdisc)
# /sbin/tc qdisc add dev eth0 root handle 1: psp default 1
(add the PSPacer class whose target rate is 500Mbps)
# /sbin/tc class add dev eth0 parent 1: classid 1:1 psp rate 500mbit
(add the PFIFO qdisc as the sub qdisc)
# /sbin/tc qdisc add dev eth0 parent 1:1 handle 10: pfifo
- run iperf (to confirm the effect of PSPacer)
$ iperf -c 192.168.1.2 -i 10 -t 60
------------------------------------------------------------
Client connecting to 192.168.1.2, TCP port 5122
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.1 port 46457 connected with 192.168.1.2 port 5122
[ 3] 0.0-10.0 sec 567 MBytes 476 Mbits/sec
[ 3] 10.0-20.0 sec 567 MBytes 476 Mbits/sec
(iperf shows payload bandwidth. 476 Mbps is the payload bandwidth
when the physical layer bandwidth is 500 Mbps and the packet size is
1500 bytes.)
- cleanup qdiscs
(remove the PFIFO sub qdisc)
# /sbin/tc qdisc del dev eth0 parent 1:1 handle 10:
(remove the PSPacer class)
# /sbin/tc class del dev eth0 parent 1: classid 1:1
(remove the PSPacer qdisc)
# /sbin/tc qdisc del dev eth0 root handle 1:
(remove the PSPacer module)
# /sbin/rmmod sch_psp
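
The iperf figures in the usage example can be cross-checked with a little
arithmetic. This sketch assumes iperf's default report units, in which
MBytes are 2^20 bytes and Mbits are 10^6 bits; the helper name
mbits_per_sec is made up for illustration.

```python
def mbits_per_sec(mbytes: float, seconds: float) -> float:
    """Convert an iperf interval report (MBytes over seconds) to Mbits/sec."""
    bits = mbytes * 1024 * 1024 * 8  # iperf MBytes taken as 2**20 bytes (assumed)
    return bits / seconds / 1e6      # iperf Mbits taken as 10**6 bits (assumed)


if __name__ == "__main__":
    rate = mbits_per_sec(567, 10.0)
    print(f"{rate:.0f} Mbits/sec")  # agrees with the 476 Mbits/sec in the report
    print(f"{rate / 500 * 100:.1f}% of the 500mbit class rate")
```

The difference between the payload rate and the 500mbit class rate is
the per-packet header overhead that iperf does not count.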
Limitations
(1) PSPacer controls the bandwidth as the ratio of the target
bandwidth to the maximum transmission bandwidth of the system.
Therefore, the system (computer, network interface, operating system,
buffer settings, etc.) must be capable of transmitting packets at
the maximum transmission rate (i.e. 1 Gbps for 1000BASE, 100
Mbps for 100BASE) to achieve precise pacing.
For this reason, if you want to control Gigabit Ethernet traffic, we
recommend using a network interface connected via PCI-X, 66MHz/64bit PCI,
or CSA. If the total target bandwidth of the output streams is less than
100Mbps, you can set the network interface to 100BASE mode to obtain
precise pacing.
For the same reason, avoid using a shared switch (dumb hub) as the edge
switch to which the PC with PSPacer is connected.
(2) PSPacer uses the IEEE 802.3x PAUSE frame as the gap between packets.
Therefore, you cannot use the PAUSE frame to stop transmission from the
switch/router to the PC. Since PSPacer generates PAUSE frames with zero
pause time, there should not be any side effects other than being unable
to stop transmission from the switch. However, it is recommended to disable
the IEEE 802.3x flow control function of the switch (to which the PC with
PSPacer is connected) in order to avoid unexpected behavior.
(3) PSPacer does not support TCP Segmentation Offload (TSO). You have
to disable TSO with the ethtool command (ethtool -K eth0 tso off).
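
For limitation (2), it may help to see what a zero-time PAUSE frame looks
like on the wire. The sketch below builds a minimum-size one from the
IEEE 802.3x layout (destination 01:80:C2:00:00:01, EtherType 0x8808,
opcode 0x0001). The function name and the source MAC address are
placeholders, and PSPacer's actual gap packets vary in length, which this
sketch does not model.

```python
import struct

PAUSE_DST = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
MIN_FRAME_NO_FCS = 60  # 64-byte minimum frame minus the 4-byte FCS


def build_pause_frame(src_mac: bytes, pause_time: int = 0) -> bytes:
    """Return a minimum-size PAUSE frame without FCS (hypothetical helper)."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= pause_time <= 0xFFFF:
        raise ValueError("pause_time must fit in 16 bits")
    header = PAUSE_DST + src_mac + struct.pack(
        "!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, pause_time
    )
    # Pad with zeros up to the minimum frame size.
    return header.ljust(MIN_FRAME_NO_FCS, b"\x00")


if __name__ == "__main__":
    frame = build_pause_frame(bytes.fromhex("020000000001"))  # placeholder MAC
    print(len(frame), frame[:18].hex())
```

With pause_time set to zero the frame asks the receiver to pause for no
time at all, which is why it can serve as a harmless filler.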
Best regards,
Ryousei Takano