Message-ID: <20090114085308.GB4234@ff.dom.local>
Date: Wed, 14 Jan 2009 08:53:08 +0000
From: Jarek Poplawski <jarkao2@...il.com>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: David Miller <davem@...emloft.net>, zbr@...emap.net,
dada1@...mosbay.com, w@....eu, ben@...s.com, mingo@...e.hu,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
jens.axboe@...cle.com
Subject: Re: [PATCH] tcp: splice as many packets as possible at once
On Wed, Jan 14, 2009 at 07:26:30PM +1100, Herbert Xu wrote:
> On Tue, Jan 13, 2009 at 11:27:10PM -0800, David Miller wrote:
> >
> > So while trying to figure out a sane way to fix this, I found
> > another bug:
> >
> >         /*
> >          * map the linear part
> >          */
> >         if (__splice_segment(virt_to_page(skb->data),
> >                              (unsigned long) skb->data & (PAGE_SIZE - 1),
> >                              skb_headlen(skb),
> >                              offset, len, skb, spd))
> >                 return 1;
> >
> > This will explode if the SLAB cache for skb->head is using compound
> > (ie. order > 0) pages.
> >
> > For example, if this is an order-1 page being used for the skb->head
> > data (which would be true on most systems for jumbo MTU frames being
> > received into a linear SKB), the offset will be wrong and depending
> > upon skb_headlen() we could reference past the end of that
> > non-compound page we will end up grabbing a reference to.
>
> I'm actually not worried so much about these packets since these
> drivers should be converted to skb frags as otherwise they'll
> probably stop working after a while due to memory fragmentation.
>
> But yeah for correctness we definitely should address this in
> skb_splice_bits.
>
> I still think Jarek's approach (the copying one) is probably the
> easiest for now until we can find a better way.
>
Actually, I still think my second approach (the PageSlab one) is
probably the easiest for now, provided it is tested: it should fix the
problem Willy reported without any change or copy overhead for splice
to file (which could still be wrong, but not obviously wrong). Then we
could search for the one right way, which is most probably somewhere
around Herbert's new skb page allocator. IMHO "my" copying approach is
too risky, e.g. for stable, because of its unknown memory
requirements, especially on configs/systems with larger page sizes.

Jarek P.