linux-kernel - Re: [PATCH v2] tcp: splice as many packets as possible at once

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20090126.221056.174077798.davem@davemloft.net>
Date:	Mon, 26 Jan 2009 22:10:56 -0800 (PST)
From:	David Miller <davem@...emloft.net>
To:	zbr@...emap.net
Cc:	jarkao2@...il.com, herbert@...dor.apana.org.au, w@....eu,
	dada1@...mosbay.com, ben@...s.com, mingo@...e.hu,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	jens.axboe@...cle.com
Subject: Re: [PATCH v2] tcp: splice as many packets as possible at once

From: Evgeniy Polyakov <zbr@...emap.net>
Date: Tue, 27 Jan 2009 00:21:30 +0300

> Hi Jarek.
> 
> On Mon, Jan 26, 2009 at 08:20:36AM +0000, Jarek Poplawski (jarkao2@...il.com) wrote:
> > > 1. Network (tree) allocator
> > > http://www.ioremap.net/projects/nta
> > 
> > I looked at this a bit, but alas I didn't find much for this Herbert's
> > idea of payload in fragments/pages. Maybe some kind of API RFC is
> > needed before this resurrection?
> 
> Basic idea is to steal some (probably a lot) pages from the slab
> allocator and put network buffers there without strict need for
> power-of-two alignment and possible wraps when we add skb_shared_info at
> the end, so that old e1000 driver required order-4 allocations for the
> jumbo frames. We can do that in alloc_skb() and friends and put returned
> buffers into skb's fraglist and updated reference counters for those
> pages; and with additional copy of the network headers into skb->head.

We are going back and forth saying the same thing, I think :-)
(BTW, I think NTA is cool and we might do something like that
eventually)

The basic thing we have to do is make the drivers receive into
pages, and then slide the network headers (only) into the linear
SKB data area.

Even for drivers like NIU and myri10ge that do this, they only
use heuristics or some fixed minimum to decide how much to
move to the linear area.

Result?  Some data payload bits end up there because it overshoots.

Since we have pskb_may_pull() calls everywhere necessary, which
means not in eth_type_trans(), we could just make these drivers
(and future drivers converted to operate in this way) only
put the ethernet headers there initially.

Then the rest of the stack will take care of moving the network
and transport payloads there, as necessary.

I bet it won't even hurt latency or routing/firewall performance.

I did test this with the NIU driver at one point, and it did not
change TCP latency nor throughput at all even at 10g speeds.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/