Message-ID: <alpine.DEB.2.00.0903310043060.5579@chino.kir.corp.google.com>
Date: Tue, 31 Mar 2009 01:23:48 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Pekka Enberg <penberg@...helsinki.fi>
cc: Christoph Lameter <cl@...ux.com>,
Nick Piggin <nickpiggin@...oo.com.au>,
Martin Bligh <mbligh@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [patch 2/3] slub: scan partial list for free slabs when
thrashing
On Tue, 31 Mar 2009, Pekka Enberg wrote:
> On Mon, 2009-03-30 at 10:37 -0400, Christoph Lameter wrote:
> > That adds fastpath overhead and it shows for small objects in your tests.
>
> Yup, and looking at this:
>
> + u16 fastpath_allocs; /* Consecutive fast allocs before slowpath */
> + u16 slowpath_allocs; /* Consecutive slow allocs before watermark */
>
> How much do operations on u16 hurt on, say, x86-64?
As opposed to unsigned int?  These simply use the word variants of the
mov, test, cmp, and inc instructions instead of the long variants.  It's
the same tradeoff as with the u16 slub fields within struct page, except
that here it isn't strictly required by size limitations but is done for
cacheline optimization.
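As a purely illustrative example (hypothetical helper, not taken from the
patch), incrementing a u16 field just selects the 16-bit operand size:

	/*
	 * Illustration only: on x86-64 the u16 increment compiles to the
	 * word form (addw/incw, carrying a 0x66 operand-size prefix)
	 * instead of the long form (addl/incl) used for unsigned int.
	 */
	static inline void count_fast_alloc(struct kmem_cache_cpu *c)
	{
		c->fastpath_allocs++;
	}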
> It's nice that
> sizeof(struct kmem_cache_cpu) is capped at 32 bytes but on CPUs that
> have bigger cache lines, the types could be wider.
>
Right, this would not change the unpacked size of the struct, whereas
using unsigned int would.
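For reference, the layout would be roughly the following (an approximate
sketch based on the struct kmem_cache_cpu of that era plus the two new
fields; the exact field names and ordering may differ from the patch):

	struct kmem_cache_cpu {
		void **freelist;	/* pointer to next available object */
		struct page *page;	/* slab we are allocating from */
		int node;		/* node of the page (-1 for debug) */
		unsigned int offset;	/* freepointer offset in word units */
		unsigned int objsize;	/* size of an object */
		u16 fastpath_allocs;	/* consecutive fast allocs before slowpath */
		u16 slowpath_allocs;	/* consecutive slow allocs before watermark */
	};

On x86-64 that packs to 8 + 8 + 4 + 4 + 4 + 2 + 2 = 32 bytes; widening the
two counters to unsigned int would grow the struct to 40 bytes once padded
to its 8-byte alignment.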
Since MAX_OBJS_PER_PAGE (which should really be renamed MAX_OBJS_PER_SLAB)
ensures there is no overflow for u16 types, the only time fastpath_allocs
would need to be wider is when the object size is sufficiently small and
there have been enough frees to the cpu slab that the counter wraps.  In
that case slowpath_allocs would simply be incremented once, and it would
be corrected the next time the cpu slab does allocate beyond the threshold
(SLAB_THRASHING_THRESHOLD should never be 1).  The chance of actually
reaching the threshold through successive fastpath counter overflows
shrinks exponentially.  And slowpath_allocs can never overflow because it
is capped at SLAB_THRASHING_THRESHOLD + 1 (the cpu slab will be refilled
with a slab that ensures slowpath_allocs is decremented the next time the
slowpath is invoked), so overflow isn't an immediate problem for either
counter.
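To make that concrete, the bookkeeping described above is roughly the
following (a simplified sketch of the idea only, not the actual patch; the
watermark parameter and the helper name are made up for illustration):

	/*
	 * Sketch only: on each slowpath allocation, decide whether the
	 * cpu slab looks like it is thrashing based on how many
	 * consecutive fastpath allocations preceded it.
	 */
	static void update_thrash_counters(struct kmem_cache_cpu *c,
					   unsigned int watermark)
	{
		if (c->fastpath_allocs < watermark) {
			/* too few fast allocs since the last refill */
			if (c->slowpath_allocs <= SLAB_THRASHING_THRESHOLD)
				c->slowpath_allocs++;
		} else if (c->slowpath_allocs) {
			/* allocated beyond the watermark: back off */
			c->slowpath_allocs--;
		}
		c->fastpath_allocs = 0;
	}

A u16 wrap of fastpath_allocs can only make the first branch trigger
spuriously, which is what the paragraph above argues is self-correcting.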
> Christoph, why is struct kmem_cache_cpu not __cacheline_aligned_in_smp
> btw?
>
The __cacheline_aligned_in_smp alignment was removed in commit
4c93c355d5d563f300df7e61ef753d7a064411e9.