Message-Id: <1232613395.11429.122.camel@ymzhang>
Date: Thu, 22 Jan 2009 16:36:34 +0800
From: "Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
To: Christoph Lameter <cl@...ux-foundation.org>
Cc: Andi Kleen <andi@...stfloor.org>,
Pekka Enberg <penberg@...helsinki.fi>,
Matthew Wilcox <matthew@....cx>,
Nick Piggin <nickpiggin@...oo.com.au>,
Andrew Morton <akpm@...ux-foundation.org>,
netdev@...r.kernel.org, sfr@...b.auug.org.au,
matthew.r.wilcox@...el.com, chinang.ma@...el.com,
linux-kernel@...r.kernel.org, sharad.c.tripathi@...el.com,
arjan@...ux.intel.com, suresh.b.siddha@...el.com,
harita.chilukuri@...el.com, douglas.w.styner@...el.com,
peter.xihong.wang@...el.com, hubert.nueckel@...el.com,
chris.mason@...cle.com, srostedt@...hat.com,
linux-scsi@...r.kernel.org, andrew.vasquez@...gic.com,
anirban.chakraborty@...gic.com
Subject: Re: Mainline kernel OLTP performance update
On Wed, 2009-01-21 at 18:58 -0500, Christoph Lameter wrote:
> On Tue, 20 Jan 2009, Zhang, Yanmin wrote:
>
> > kmem_cache skbuff_head_cache's object size is just 256, so it shares the kmem_cache
> > with :0000256. Their order is 1, which means every slab consists of 2 physical pages.
>
> That order can be changed. Try specifying slub_max_order=0 on the kernel
> command line to force an order 0 alloc.
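For reference, forcing and verifying this might look as follows; a minimal sketch,
assuming a SLUB kernel with sysfs mounted, and using the :0000256 alias named above:

# append to the kernel command line at boot:
#   slub_max_order=0
# after reboot, check the order the unified 256-byte cache actually uses:
cat /sys/kernel/slab/:0000256/order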
I tried slub_max_order=0 and there is no improvement on this UDP-U-4k issue.
The CPU time in both get_page_from_freelist and __free_pages_ok is still very high.
I checked my instrumentation in the kernel and found the time is spent on large-object
allocation/free, where the object size is more than PAGE_SIZE. Here the order is 1
(see the sketch below).
The relevant free call chain is __kfree_skb => skb_release_all => skb_release_data.
So this case isn't the issue where a batch of allocations/frees might defeat the
partial-page functionality.
'#slabinfo -AD' doesn't show statistics for large-object allocation/free. Can we add
such info? That would be more helpful.
In addition, I didn't see this issue with TCP stream testing.
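To make the rounding concrete: a 4 KiB payload plus the skb overhead is just over one
page, and the buddy allocator rounds a request up to the next power-of-two number of
pages, which is where the order-1 allocations come from. Below is a minimal user-space
sketch; it mirrors the kernel's get_order() rounding, and the 4 KiB page size and the
exact skb overhead are assumptions for illustration, not measurements:

/* sketch: how object size maps to page-allocation order (assumes 4 KiB pages) */
#include <stdio.h>

#define PAGE_SHIFT 12

static int get_order(unsigned long size)
{
	int order = 0;

	size = (size - 1) >> PAGE_SHIFT;
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}

int main(void)
{
	/* a 4 KiB UDP payload plus skb overhead lands just past one page */
	unsigned long sizes[] = { 4096, 4096 + 256, 8192 };
	int i;

	for (i = 0; i < 3; i++)
		printf("size %5lu -> order %d\n", sizes[i], get_order(sizes[i]));
	return 0;
}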
>
> The queues of the page allocator are of limited use due to their overhead.
> Order-1 allocations can actually be 5% faster than order-0. Order-0 makes
> sense if pages are pushed rapidly to the page allocator and are then
> reissued elsewhere. If there is linear consumption, then the page
> allocator queues are just overhead.
>
> > Page allocator has an array at zone_pcp(zone, cpu)->pcp to keep a page buffer for page order 0.
> > But here skbuff_head_cache's order is 1, so UDP-U-4k couldn't benefit from the page buffer.
>
> That usually does not matter because of partial list avoiding page
> allocator actions.
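The order-0-only nature of that pcp buffer shows up directly in the free path. Roughly
as in mm/page_alloc.c of this era (quoted from memory, so treat details as approximate),
only order-0 frees are buffered on the per-CPU hot list; order-1 frees like the ones
above fall straight through to __free_pages_ok(), which is exactly what the profile
shows:

void __free_pages(struct page *page, unsigned int order)
{
	if (put_page_testzero(page)) {
		if (order == 0)
			free_hot_page(page);		/* zone_pcp(zone, cpu)->pcp buffer */
		else
			__free_pages_ok(page, order);	/* the hot spot in the profile */
	}
}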
>
> > SLQB has no such issue, because:
> > 1) SLQB has a percpu freelist. Free objects are put on that list first and can be
> > picked up again quickly without locking. A batch parameter controls when free objects
> > are handed back; it is typically 1024.
> > 2) SLQB's slab order is mostly 0, so although it sometimes calls alloc_pages/free_pages,
> > it can benefit from the zone_pcp(zone, cpu)->pcp page buffer.
> >
> > So SLUB needs to resolve the case where one process allocates a batch of objects and
> > another process frees them in a batch (a minimal model of this batching is sketched
> > below).
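Here is a minimal user-space model of that batching scheme, just to make the free-side
behaviour concrete. The names, the 256-byte object size, and the FREE_BATCH value of
1024 follow the description above, but the structure is illustrative, not SLQB's actual
code; note also that the problem case in this thread is the alloc and free landing on
different CPUs' lists, while the sketch shows only the single-list batching:

#include <stdio.h>
#include <stdlib.h>

#define FREE_BATCH 1024

struct object {
	struct object *next;
	char payload[248];		/* models a 256-byte slab object */
};

struct cpu_freelist {
	struct object *head;
	int nr;
};

/* free side: push onto the local list; no lock, no page allocator work */
static void object_free(struct cpu_freelist *fl, struct object *obj)
{
	obj->next = fl->head;
	fl->head = obj;
	if (++fl->nr < FREE_BATCH)
		return;
	/* batch threshold reached: hand everything back in one pass */
	while (fl->head) {
		struct object *victim = fl->head;

		fl->head = victim->next;
		free(victim);
	}
	fl->nr = 0;
}

/* alloc side: reuse a locally freed object when one is available */
static struct object *object_alloc(struct cpu_freelist *fl)
{
	if (fl->head) {
		struct object *obj = fl->head;

		fl->head = obj->next;
		fl->nr--;
		return obj;
	}
	return malloc(sizeof(struct object));
}

int main(void)
{
	struct cpu_freelist fl = { NULL, 0 };
	struct object *obj = object_alloc(&fl);

	object_free(&fl, obj);
	obj = object_alloc(&fl);	/* served from the local list */
	object_free(&fl, obj);
	return 0;
}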
>
> SLUB has a percpu freelist but it's bounded by the basic allocation unit.
> You can increase that by modifying the allocation order. Writing a 3 or 5
> into the order value in /sys/kernel/slab/xxx/order would do the trick.
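Concretely, that tuning might look like this; a sketch, where the :0000256 alias name
comes from earlier in the thread, so check the actual cache name under /sys/kernel/slab
on the system in question:

echo 3 > /sys/kernel/slab/:0000256/order
cat /sys/kernel/slab/:0000256/order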