linux-kernel - Re: [PATCH v2] tcp: splice as many packets as possible at once

Message-Id: <20090202.235017.253437221.davem@davemloft.net>
Date:	Mon, 02 Feb 2009 23:50:17 -0800 (PST)
From:	David Miller <davem@...emloft.net>
To:	jarkao2@...il.com
Cc:	herbert@...dor.apana.org.au, zbr@...emap.net, w@....eu,
	dada1@...mosbay.com, ben@...s.com, mingo@...e.hu,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	jens.axboe@...cle.com
Subject: Re: [PATCH v2] tcp: splice as many packets as possible at once

From: Jarek Poplawski <jarkao2@...il.com>
Date: Mon, 2 Feb 2009 08:43:58 +0000

> On Mon, Feb 02, 2009 at 12:18:54AM -0800, David Miller wrote:
> > Allocating 4096 or 8192 bytes for a 1500 byte frame is wasteful.
> 
> I mean allocating chunks of cached pages, similar to the way
> sk_sndmsg_page works. I guess a similar problem has to be worked out
> in any case. But it seems doing it on the linear area requires fewer
> changes in other places.

This is a very interesting idea, but it has some drawbacks:

1) Just like any other allocator we'll need to find a way to
   handle > PAGE_SIZE allocations, and thus add handling for
   compound pages etc.

   And the drivers that want such huge SKB data areas on receive
   are exactly the ones that should be converted to use
   scatter-gather page vectors, in order to avoid multi-order
   pages and the strain they put on the page allocator.

2) Space wastage and poor packing can be an issue.

   Even with SLAB/SLUB we get poor packing; look at the graphs
   Evgeniy made when writing his NTA patches.

Now, when choosing a way to move forward, I'm willing to accept
some of the issues in #2 for the sake of avoiding the issues in
#1 above.
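
To put rough numbers on that trade-off, here is a minimal sketch of the arithmetic. It assumes a 4096-byte PAGE_SIZE and ignores headroom and skb_shared_info overhead, so the real figures would be somewhat worse. The helper names and the 9000-byte jumbo frame are illustrative assumptions and do not come from the kernel or from this thread.

```python
PAGE_SIZE = 4096  # assumed page size; other architectures differ


def page_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that (page_size << order) >= nbytes."""
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


def wasted_bytes(nbytes, page_size=PAGE_SIZE):
    """Bytes left unused when one buffer gets its own 2^order pages."""
    return (page_size << page_order(nbytes, page_size)) - nbytes


def frames_per_page(frame_bytes, page_size=PAGE_SIZE):
    """How many whole frames fit when packing them into a single page."""
    return page_size // frame_bytes


if __name__ == "__main__":
    for size in (1500, 9000):  # 9000 is a hypothetical jumbo frame
        print(size, "bytes -> order", page_order(size),
              "wasted", wasted_bytes(size))
    print("1500-byte frames per page:", frames_per_page(1500))
```

Under these assumptions a 1500-byte frame given its own page fits in order 0 but leaves 2596 bytes unused (issue #2), while a 9000-byte buffer needs an order-2 allocation, i.e. four contiguous pages (issue #1). Packing two 1500-byte frames into one page still leaves 1096 bytes over, which is the kind of imperfect packing being accepted here.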

Jarek, note that we can just keep your current splice() copy hacks in
there.  As a result we get a migration path that is easier to handle.
We just do the page RX allocation conversions in the drivers where
performance really matters, for hardware a lot of people have.

That's a lot smoother and has fewer issues than turning the
system-wide SKB allocator upside down.
