Message-ID: <20070821204145.GA23795@Krystal>
Date: Tue, 21 Aug 2007 16:41:45 -0400
From: Mathieu Desnoyers <compudj@...stal.dyndns.org>
To: Christoph Lameter <clameter@....com>
Cc: akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
mingo@...hat.com
Subject: Re: [PATCH] SLUB use cmpxchg_local
* Mathieu Desnoyers (mathieu.desnoyers@...ymtl.ca) wrote:
> Ok, I played with your patch a bit, and the results are quite
> interesting:
>
...
> Summary:
>
> (tests repeated 10000 times on a 3GHz Pentium 4)
> (kernel DEBUG menuconfig options are turned off)
> results are in cycles per iteration
> I did 2 runs of the slab.git HEAD to have an idea of errors associated
> to the measurements:
>
>               | slab.git HEAD slub (min-max) | cmpxchg_local slub
> kmalloc(8)    | 190 - 201                    | 83
> kfree(8)      | 351 - 351                    | 363
> kmalloc(64)   | 224 - 245                    | 115
> kfree(64)     | 389 - 394                    | 397
> kmalloc(16384)| 713 - 741                    | 724
> kfree(16384)  | 843 - 856                    | 843
>
> Therefore, there seems to be a repeatable gain on the kmalloc fast path
> (more than twice as fast). No significant performance hit for the kfree
> case, but no gain either; the same holds for large kmalloc, as expected.
>
Having no performance improvement for kfree seems a little weird, since
we are moving from irq disable to cmpxchg_local in the fast path. A
possible explanation would be that we are always hitting the slow path.
I did a simple test, counting the number of fast vs slow paths with my
cmpxchg_local slub version:
(initial state before the test)
[ 386.359364] Fast slub free: 654982
[ 386.369507] Slow slub free: 392195
[ 386.379660] SLUB Performance testing
[ 386.390361] ========================
[ 386.401020] 1. Kmalloc: Repeatedly allocate then free test
...
(after test 1)
[ 387.366002] Fast slub free: 657338 diff (2356)
[ 387.376158] Slow slub free: 482162 diff (89967)
[ 387.386294] 2. Kmalloc: alloc/free test
...
(after test 2)
[ 387.897816] Fast slub free: 748968 diff (91630)
[ 387.907947] Slow slub free: 482584 diff (422)
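
To make the counter deltas above easier to compare, here is a minimal sketch that turns them into a fast-path percentage per test. The helper name `fast_path_percent` and the enum constants are made up for illustration; only the numbers come from the log.

```c
#include <assert.h>

/* Counter deltas taken from the log output above. */
enum {
    TEST1_FAST = 657338 - 654982,   /* 2356  */
    TEST1_SLOW = 482162 - 392195,   /* 89967 */
    TEST2_FAST = 748968 - 657338,   /* 91630 */
    TEST2_SLOW = 482584 - 482162    /* 422   */
};

/*
 * Hypothetical helper: percentage of frees that took the fast path,
 * rounded down to a whole percent.
 */
static unsigned long fast_path_percent(unsigned long fast, unsigned long slow)
{
    unsigned long total = fast + slow;

    assert(total > 0);              /* avoid dividing by zero */
    return fast * 100 / total;
}
```

With these inputs, test 1 comes out at about 2% fast-path frees and test 2 at about 99%.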
Therefore, in the test where we have separate passes for slub allocation
and free, we mostly hit the slow path. Is there any particular reason for
that? Note that the alloc/free test (test 2) seems to hit the fast path as
expected.
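
For readers who want to picture the kind of counting described above, here is a rough userspace sketch. It is not the SLUB code and not the instrumentation patch used for these numbers: C11 atomics stand in for cmpxchg_local, and `struct cache`, `cache_free`, and the `fast_ok` flag are hypothetical names. The flag only represents whatever condition the allocator uses to pick a path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical free object: its first word links to the next free object. */
struct object {
    struct object *next;
};

/* Hypothetical cache with one counter per free path. */
struct cache {
    _Atomic(struct object *) freelist;  /* lockless list of free objects */
    int fast_ok;                        /* stand-in for the real path check */
    unsigned long fast_free;
    unsigned long slow_free;
};

/* Free one object and count which path was taken. */
static void cache_free(struct cache *c, struct object *obj)
{
    struct object *old;

    assert(c != NULL && obj != NULL);

    if (!c->fast_ok) {
        c->slow_free++;                 /* the real slow path would lock here */
        return;
    }

    /* Fast path: push onto the freelist with a compare-and-exchange loop. */
    old = atomic_load(&c->freelist);
    do {
        obj->next = old;
    } while (!atomic_compare_exchange_weak(&c->freelist, &old, obj));
    c->fast_free++;
}
```

Printing `fast_free` and `slow_free` before and after each test would give deltas of the same shape as the log lines above.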
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68