[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090112124545.GA10893@ioremap.net>
Date: Mon, 12 Jan 2009 15:45:46 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: Jarek Poplawski <jarkao2@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jens Axboe <jens.axboe@...cle.com>, Willy Tarreau <w@....eu>,
Changli Gao <xiaosuo@...il.com>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: Data corruption issue with splice() on 2.6.27.10
On Mon, Jan 12, 2009 at 11:02:57PM +1100, Herbert Xu (herbert@...dor.apana.org.au) wrote:
> > > Hmm... in any case: take 3
> >
> > Yes this should fix the corruption but it kind of defeats the
> > purpose of splice by copying the data.
>
> However, as we don't have a better fix yet, we probably should
> take Jarek's patch for now since data corruption is bad.
Iirc it copies data from skb to the new pipe page unconditionally while
it is needed only for skb->sendpage path, although it is not possible to
know what is the other side of the pipe (or not?).
What about storing a callback and private pointer in the shared info for
the skb and clone them during usual clone, and invoke the callback at
shared info freeing time, which in turn will call spd->spd_release()?
Given that we only need to protect linear part, it should be simple
enough and we will not need to mess with the pskb_expand* calls.
> This is a very hard problem, so in the end the only viable solution
> might be to get the drivers to switch to using page frags, like
> the Intel page split method.
As a long-term solution this sounds as the best case, but introduces
quite heavy overhead for the allocators. Right now we allocate
1500+shared_info rounded up to the nearest power of the two (2k), but
then we will either need to have own network allocator (I have one :) or
allocate PAGE_SIZE+shared_info rounded up to the pwoer of the two (i.e.
8k), which is unfeasible.
--
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists