Message-ID: <20091014154944.GD5027@csn.ul.ie>
Date: Wed, 14 Oct 2009 16:49:44 +0100
From: Mel Gorman <mel@....ul.ie>
To: Christoph Lameter <cl@...ux-foundation.org>
Cc: David Rientjes <rientjes@...gle.com>,
Pekka Enberg <penberg@...helsinki.fi>,
Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Zhang Yanmin <yanmin_zhang@...ux.intel.com>
Subject: Re: [this_cpu_xx V6 7/7] this_cpu: slub aggressive use of this_cpu
operations in the hotpaths
On Wed, Oct 14, 2009 at 10:08:12AM -0400, Christoph Lameter wrote:
> The test did not include the irqless patch I hope?
>
Correct. Only the patches in this thread were tested.
> On Wed, 14 Oct 2009, Mel Gorman wrote:
>
> > Small gains in the User, System and Elapsed times with this-cpu patches
> > applied. It is interesting to note for the mean times that the patches more
> > than close the gap between SLUB and SLAB for the most part - the
> > exception being User which has marginally better performance. This might
> > indicate that SLAB is still slightly better at giving back cache-hot
> > memory but this is speculation.
>
> The queuing in SLAB allows a better cache hot behavior. Without a queue
> SLUB has a difficult time improvising cache hot behavior based on objects
> restricted to a slab page. Therefore the size of the slab page will
> affect how much "queueing" SLUB can do.
>
Ok, so the speculation is a plausible explanation.
> > The patches mostly improve the performance of netperf UDP_STREAM by a good
> > whack so the patches are a plus here. However, it should also be noted that
> > SLAB was mostly faster than SLUB, particularly for large packet sizes. Refresh
> > my memory, how do SLUB and SLAB differ in regards to off-loading large
> > allocations to the page allocator these days?
>
> SLUB offloads allocations > 8k to the page allocator.
> SLAB does create large slabs.
>
Allocations >8k might then explain why UDP_STREAM performance suffers for
8K and 16K packets. That can be marked as possible future work to sort
out within the allocator.

However, does it explain why TCP_STREAM suffers so badly even for packet
sizes like 2K? It's also important to note that in some cases SLAB was far
slower even when the packet sizes were greater than 8k, so I don't think
the page allocator is an adequate explanation for TCP_STREAM.
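
To make the size cutoff under discussion concrete, here is a rough
userspace sketch of the decision as Christoph describes it. It is not
kernel code: the 4K page size, the 8K threshold expressed as two pages,
and every sketch_* name are assumptions made for illustration only.

```c
#include <stddef.h>

/* Assumption: 4K pages, so the 8K cutoff quoted above is two pages. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_SLUB_MAX  (2u * SKETCH_PAGE_SIZE)

/* Hypothetical labels for where a request ends up. */
enum sketch_backend {
	SKETCH_SLAB_CACHE,      /* served from a per-size slab cache */
	SKETCH_PAGE_ALLOCATOR   /* passed straight to the page allocator */
};

/* Requests strictly larger than the cutoff bypass the slab caches. */
static enum sketch_backend sketch_slub_backend(size_t size)
{
	return size > SKETCH_SLUB_MAX ? SKETCH_PAGE_ALLOCATOR
				      : SKETCH_SLAB_CACHE;
}

/* Smallest power-of-two page order whose span covers the request. */
static unsigned int sketch_page_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions a 2K request stays in a slab cache, while a 16K
request goes to the page allocator as an order-2 allocation. That is
consistent with the cutoff being a candidate explanation for the large
UDP_STREAM packets but not for the 2K TCP_STREAM case.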
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab