netdev - Re: [PATCH 0/4][RFC] lro: Generic Large Receive Offload for TCP traffic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <46AE232F.3080409@myri.com>
Date:	Mon, 30 Jul 2007 13:43:11 -0400
From:	Andrew Gallatin <gallatin@...i.com>
To:	Linas Vepstas <linas@...tin.ibm.com>
CC:	Jan-Bernd Themann <ossthema@...ibm.com>,
	netdev <netdev@...r.kernel.org>,
	Thomas Klein <tklein@...ibm.com>,
	Jeff Garzik <jeff@...zik.org>,
	Jan-Bernd Themann <themann@...ibm.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-ppc <linuxppc-dev@...abs.org>,
	Christoph Raisch <raisch@...ibm.com>,
	Marcus Eder <meder@...ibm.com>,
	Stefan Roscher <stefan.roscher@...ibm.com>,
	David Miller <davem@...emloft.net>
Subject: Re: [PATCH 0/4][RFC] lro: Generic Large Receive Offload for TCP traffic

Here is a quick reply before something more official can
be written up:

Linas Vepstas wrote:

 > -- what is LRO?

Large Receive Offload

 > -- Basic principles of operation?

LRO is analogous to a receive side version of TSO.  The NIC (or
driver) merges several consecutive segments from the same connection,
fixing up checksums, etc.  Rather than up to 45 separate 1500 byte
frames (meaning up to 45 trips through the network stack), the driver
merges them into one 65212 byte frame.  It currently works only
with TCP over IPv4.

LRO was, AFAIK, first though of by Neterion.  They had a paper about
it at OLS2005.
http://www.linuxinsight.com/files/ols2005/grossman-reprint.pdf

 > -- Can I use it in my driver?

Yes, it can be used in any driver.

 > -- Does my hardware have to have some special feature before I can 
use it?

No.

 > -- What sort of performance improvement does it provide? Throughput?
 >    Latency? CPU usage? How does it affect DMA allocation? Does it
 >    improve only a certain type of traffic (large/small packets, etc.)

The benefit is directly proportional to the packet rate.

See my reply to the previous RFC for performance information.  The
executive summary is that for the myri10ge 10GbE driver on low end
hardware with 1500b frames, I've seen it increase throughput by a
factor of nearly 2.5x, while at the same time reducing CPU utilization
by 17%.  The affect for jumbo frames is less dramatic, but still
impressive (1.10x, 14% CPU reduction)

You can achieve better speedups if your driver receives into
high-order pages.

 > -- Example code? What's the API? How should my driver use it?

The 3/4 in this patch showed an example of converting a driver
to use LRO for skb based receive buffers.   I'm working on
a patch for myri10ge that shows how you would use it in a driver
which receives into pages.

Cheers,

Drew
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html