Message-Id: <20090205033241.a99121fe.billfink@mindspring.com>
Date:	Thu, 5 Feb 2009 03:32:41 -0500
From:	Bill Fink <billfink@...dspring.com>
To:	Willy Tarreau <w@....eu>
Cc:	David Miller <davem@...emloft.net>, herbert@...dor.apana.org.au,
	zbr@...emap.net, jarkao2@...il.com, dada1@...mosbay.com,
	ben@...s.com, mingo@...e.hu, linux-kernel@...r.kernel.org,
	netdev@...r.kernel.org, jens.axboe@...cle.com
Subject: Re: [PATCH v2] tcp: splice as many packets as possible at once

On Wed, 4 Feb 2009, Willy Tarreau wrote:

> On Wed, Feb 04, 2009 at 01:01:46AM -0800, David Miller wrote:
> > From: Herbert Xu <herbert@...dor.apana.org.au>
> > Date: Wed, 4 Feb 2009 19:59:07 +1100
> > 
> > > On Wed, Feb 04, 2009 at 09:54:32AM +0100, Willy Tarreau wrote:
> > > >
> > > > My server is running 2.4 :-), but I observed the same issues with older
> > > > 2.6 as well. I can certainly imagine that things have changed a lot since,
> > > > but the initial point remains: jumbo frames are expensive to deal with,
> > > > and with recent NICs and drivers, we might get comparable performance at
> > > > little additional cost. After all, the initial justification for jumbo
> > > > frames was the devastating interrupt rate, and all NICs coalesce
> > > > interrupts now.
> > > 
> > > This is total crap! Jumbo frames are way better than any of the
> > > hacks (such as GSO) that people have come up with to get around it.
> > > The only reason we are not using it as much is because of this
> > > nasty thing called the Internet.
> > 
> > Completely agreed.
> > 
> > If Jumbo frames are slower, it is NOT some fundamental issue.  It is
> > rather due to some misdesign of the hardware or its driver.
> 
> Agreed, we can't use them *because* of the internet, but this
> limitation has forced hardware designers to find valid alternatives.
> For instance, having the ability to reach 10 Gbps with 1500-byte
> frames on myri10ge with low CPU usage is a real achievement. This
> is "only" 800 kpps after all.
> 
> And the arbitrary choice of 9k for jumbo frames was total crap too.
> It's clear that no hardware designer was involved in the process.
> They have to stuff 16 kB of RAM on a NIC to use only 9 kB of it. And
> we need to allocate 3 pages to hold slightly more than 2 pages' worth
> of data. 7.5 kB would have been better in this regard.
> 
> I still find it nice to lower CPU usage with frames larger than 1500,
> but given that this is rarely used (even in datacenters), I think our
> efforts should concentrate on where the real users are, i.e. <1500.
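
(For reference, the arithmetic behind the two figures quoted above: at
10 Gbps, 10^10 bits/s / (1500 bytes * 8 bits/byte) ~= 833k frames/s,
or roughly 813 kpps once the 38 bytes of per-frame Ethernet overhead
(preamble, header, FCS, inter-frame gap) are counted. And with
4096-byte pages, a 9000-byte frame needs ceil(9000/4096) = 3 pages
(12288 bytes), while a 7.5 kB frame would fit in 2 (8192 bytes).)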

Those in the HPC realm use 9000-byte jumbo frames because they make
a major performance difference, especially across large-RTT paths,
and the Internet2 backbone fully supports 9000-byte jumbo frames
(with some wishing we could support much larger frame sizes).
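
(The rough per-packet arithmetic: an IPv4 TCP segment carries at most
MTU - 40 bytes of payload, i.e. 1460 bytes at a 1500-byte MTU versus
8960 at 9000, so the same transfer takes about one sixth as many
packets, with correspondingly fewer interrupts, ACKs, and per-packet
protocol decisions per byte moved.)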

Local environment:

9000-byte jumbo frames:

[root@...g2 ~]# nuttcp -w10m 192.168.88.16
11818.1875 MB /  10.01 sec = 9905.9707 Mbps 100 %TX 76 %RX 0 retrans 0.15 msRTT

4080-byte MTU:

[root@...g2 ~]# nuttcp -w10m 192.168.88.16
 9171.6875 MB /  10.02 sec = 7680.7663 Mbps 100 %TX 99 %RX 0 retrans 0.19 msRTT
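
For reference, the MTU for each run would be set on both hosts with
something along these lines (the interface name eth2 is only
illustrative):

ip link set dev eth2 mtu 9000    # jumbo-frame run
ip link set dev eth2 mtu 4080    # 4080-byte run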

The performance impact is even more pronounced on a large-RTT path,
such as the following netem-emulated 80 ms RTT path:

9000-byte jumbo frames:

[root@...g2 ~]# nuttcp -T30 -w80m 192.168.89.15
25904.2500 MB /  30.16 sec = 7205.8755 Mbps 96 %TX 55 %RX 0 retrans 82.73 msRTT

4080-byte MTU:

[root@...g2 ~]# nuttcp -T30 -w80m 192.168.89.15
 8650.0129 MB /  30.25 sec = 2398.8862 Mbps 33 %TX 19 %RX 2371 retrans 81.98 msRTT
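
For reference, one way to emulate an 80 ms RTT like this with netem is
to add 40 ms of delay in each direction (device name hypothetical):

tc qdisc add dev eth2 root netem delay 40ms    # run on both hosts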

And if there's any loss in the path, the performance difference is also
dramatic, as seen here across a real MAN environment with about a 1 ms RTT:

9000-byte jumbo frames:

[root@...nce9 ~]# nuttcp -w20m 192.168.88.8
 7711.8750 MB /  10.05 sec = 6436.2406 Mbps 82 %TX 96 %RX 261 retrans 0.92 msRTT

4080-byte MTU:

[root@...nce9 ~]# nuttcp -w20m 192.168.88.8
 4551.0625 MB /  10.08 sec = 3786.2108 Mbps 50 %TX 95 %RX 42 retrans 0.95 msRTT
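
(The -w window sizes presumably track the bandwidth-delay product: at
10 Gbps, an 80 ms path holds 10^10 b/s * 0.08 s / 8 = 100 MB in
flight, and an 80 MB window across 80 ms caps throughput at
80 MB / 0.08 s = 8 Gbps, consistent with the ~7.2 Gbps measured above;
the ~1 ms paths have a BDP of only a couple of MB at 10 Gbps, so
-w10m and -w20m are ample.)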

All testing was done with myri10ge on the transmitter side (2.6.20.7 kernel).

So my experience has definitely been that 9000-byte jumbo frames are a
major performance win for high-throughput applications.

						-Bill
