[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090206091034.GA4879@ff.dom.local>
Date: Fri, 6 Feb 2009 09:10:34 +0000
From: Jarek Poplawski <jarkao2@...il.com>
To: David Miller <davem@...emloft.net>
Cc: herbert@...dor.apana.org.au, zbr@...emap.net, w@....eu,
dada1@...mosbay.com, ben@...s.com, mingo@...e.hu,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
jens.axboe@...cle.com
Subject: Re: [PATCH v2] tcp: splice as many packets as possible at once
On Thu, Feb 05, 2009 at 11:52:58PM -0800, David Miller wrote:
> From: Jarek Poplawski <jarkao2@...il.com>
> Date: Tue, 3 Feb 2009 09:41:08 +0000
>
> > Yes, this looks reasonable. On the other hand, I think it would be
> > nice to get some opinions of slab folks (incl. Evgeniy) on the expected
> > efficiency of such a solution. (It seems releasing with put_page() will
> > always have some cost with delayed reusing and/or waste of space.)
>
> I think we can't avoid using carved up pages for skb->data in the end.
> The whole kernel wants to speak in pages and be able to grab and
> release them in one way and one way only (get_page() and put_page()).
>
> What do you think is more likely? Us teaching the whole entire kernel
> how to hold onto SKB linear data buffers, or the networking fixing
> itself to operate on pages for it's header metadata? :-)
This idea looks very reasonable, except I wander why nobody else
didn't need this kind of mm interface. Another question is it seems
many mechanisms like fast searching, defragmentation etc. could be
reused.
> What we'll end up with is likely a hybrid scheme. High speed devices
> will receive into pages. And also the skb->data area will be page
> backed and held using get_page()/put_page() references.
>
> It is not even worth optimizing for skb->data holding the entire
> packet, that's not the case that matters.
>
> These skb->data areas will thus be 128 bytes plus the skb_shinfo
> structure blob. They also will be recycled often, rather than held
> onto for long periods of time.
Looks fine, except: you mentioned dumb NICs, which would need this
page space on receive, anyway. BTW, don't they need this on transmit
again?
> In fact we can optimize that even further in many ways, for example by
> dropping the skb->data backed memory once the skb is queued to the
> socket receive buffer. That will make skb->data buffer lifetimes
> miniscule even under heavy receive load.
>
> In that kind of situation, doing even the most stupidest page slicing
> algorithm, similar to what we do now with sk->sk_sndmsg_page, is
> more than adequate and things like NTA (purely to solve this problem)
> is overengineering.
Hmm... I don't get it. It seems these slabs do a lot of advanced work,
and still some people like Evgeniy or Nick thought it's not enough,
and even found it worth of their time to rework this.
There is also a question of memory accounting: do you think admins
don't care if we give away say 25% additionally?
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists