Message-ID: <20090112125914.GA12142@ioremap.net>
Date: Mon, 12 Jan 2009 15:59:14 +0300
From: Evgeniy Polyakov <zbr@...emap.net>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: Jarek Poplawski <jarkao2@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Jens Axboe <jens.axboe@...cle.com>, Willy Tarreau <w@....eu>,
Changli Gao <xiaosuo@...il.com>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: Data corruption issue with splice() on 2.6.27.10
On Mon, Jan 12, 2009 at 11:56:07PM +1100, Herbert Xu (herbert@...dor.apana.org.au) wrote:
> > As a long-term solution this sounds like the best option, but it introduces
> > quite heavy overhead for the allocators. Right now we allocate
> > 1500+shared_info rounded up to the nearest power of two (2k), but
> > then we will either need to have our own network allocator (I have one :) or
> > allocate PAGE_SIZE+shared_info rounded up to the power of two (i.e.
> > 8k), which is not feasible.
>
> No, that's not what I was suggesting. The page split model allocates
> an skb with a very small head that accommodates only the headers.
> All payload is stored in the frags structure.
>
> For 1500-byte packets, we can manage the payload area efficiently
> by dividing each allocated page into 2K chunks. The page will
> then be automatically freed once all the 2K chunks on it have been
> freed through the page ref count mechanism.
That's the part I referred to as the network's own allocator. We could,
though, have multiple kmem_caches for the popular MTUs and round up the
requested size otherwise.
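As a sketch of that idea only (the cache names, sizes and the kmalloc()
fallback below are assumptions, not an existing API):

	#include <linux/kernel.h>
	#include <linux/slab.h>

	/* Illustrative sizes for "popular" MTU-derived allocations. */
	static const struct {
		const char	*name;
		size_t		size;
	} skb_data_sizes[] = {
		{ "skb_data_2k", 2048 },	/* 1500 + shared_info, rounded */
		{ "skb_data_4k", 4096 },
		{ "skb_data_9k", 9216 },	/* jumbo frames */
	};

	static struct kmem_cache *skb_data_caches[ARRAY_SIZE(skb_data_sizes)];

	static int __init skb_data_caches_init(void)
	{
		int i;

		for (i = 0; i < ARRAY_SIZE(skb_data_sizes); i++) {
			skb_data_caches[i] =
				kmem_cache_create(skb_data_sizes[i].name,
						  skb_data_sizes[i].size,
						  0, SLAB_HWCACHE_ALIGN, NULL);
			if (!skb_data_caches[i])
				return -ENOMEM;
		}
		return 0;
	}

	/* Pick the smallest cache that fits; otherwise round up via kmalloc(). */
	static void *skb_data_alloc(size_t size, gfp_t gfp)
	{
		int i;

		for (i = 0; i < ARRAY_SIZE(skb_data_sizes); i++)
			if (size <= skb_data_sizes[i].size)
				return kmem_cache_alloc(skb_data_caches[i], gfp);

		return kmalloc(size, gfp);
	}

The catch is that the free path has to know which cache an object came
from (kmem_cache_free() rather than kfree()), so the size or the cache
pointer has to be carried along with the data.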
--
Evgeniy Polyakov