Message-ID: <1334132428.5300.2685.camel@edumazet-glaptop>
Date: Wed, 11 Apr 2012 10:20:28 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Alexander Duyck <alexander.h.duyck@...el.com>
Cc: Ian Campbell <ian.campbell@...rix.com>, netdev@...r.kernel.org,
David Miller <davem@...emloft.net>,
"Michael S. Tsirkin" <mst@...hat.com>,
Wei Liu <wei.liu2@...rix.com>, xen-devel@...ts.xen.org
Subject: Re: [PATCH 05/10] net: move destructor_arg to the front of sk_buff.
On Tue, 2012-04-10 at 12:15 -0700, Alexander Duyck wrote:
>
> Actually now that I think about it my concerns go much further than the
> memset. I'm convinced that this is going to cause a pretty significant
> performance regression on multiple drivers, especially on non x86_64
> architecture. What we have right now on most platforms is a
> skb_shared_info structure in which everything up to and including frag 0
is all in one cache line. This gives us pretty good performance for igb
and ixgbe, since our common case when jumbo frames are not enabled is to
split the head and place the data in a page.
I don't understand this split thing for MTU=1500 frames.
Even using half a page per fragment, each skb needs 2 allocations (for
the sk_buff and skb->head), plus one page alloc / reference.

skb->truesize = ksize(skb->head) + sizeof(*skb) + PAGE_SIZE/2
              = 512 + 256 + 2048 = 2816 bytes
With non split you have 2 allocations (for the sk_buff and skb->head):

skb->truesize = ksize(skb->head) + sizeof(*skb)
              = 2048 + 256 = 2304 bytes

Less overhead and fewer calls to the page allocator...
This can only be a benefit if GRO is on, since aggregation can then use
fragments and a single sk_buff instead of a frag_list.