linux-kernel - Re: [PATCH] SLUB use cmpxchg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20070821204145.GA23795@Krystal>
Date:	Tue, 21 Aug 2007 16:41:45 -0400
From:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
To:	Christoph Lameter <clameter@....com>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	mingo@...hat.com
Subject: Re: [PATCH] SLUB use cmpxchg_local

* Mathieu Desnoyers (mathieu.desnoyers@...ymtl.ca) wrote:
> Ok, I played with your patch a bit, and the results are quite
> interesting:
> 
...
> Summary:
> 
> (tests repeated 10000 times on a 3GHz Pentium 4)
> (kernel DEBUG menuconfig options are turned off)
> results are in cycles per iteration
> I did 2 runs of the slab.git HEAD to have an idea of errors associated
> to the measurements:
> 
>              |     slab.git HEAD slub (min-max)    |  cmpxchg_local slub
> kmalloc(8)   |         190 - 201                   |         83
> kfree(8)     |         351 - 351                   |        363
> kmalloc(64)  |         224 - 245                   |        115
> kfree(64)    |         389 - 394                   |        397
> kmalloc(16384)|        713 - 741                   |        724
> kfree(16384) |         843 - 856                   |        843
> 
> Therefore, there seems to be a repeatable gain on the kmalloc fast path
> (more than twice faster). No significant performance hit for the kfree
> case, but no gain neither, same for large kmalloc, as expected.
> 

Having no performance improvement for kfree seems a little weird, since
we are moving from irq disable to cmpxchg_local in the fast path. A
possible explanation would be that we are always hitting the slow path.
I did a simple test, counting the number of fast vs slow paths with my
cmpxchg_local slub version:

(initial state before the test)
[  386.359364] Fast slub free: 654982
[  386.369507] Slow slub free: 392195
[  386.379660] SLUB Performance testing
[  386.390361] ========================
[  386.401020] 1. Kmalloc: Repeatedly allocate then free test
...
(after test 1)
[  387.366002] Fast slub free: 657338 diff (2356)
[  387.376158] Slow slub free: 482162 diff (89967)

[  387.386294] 2. Kmalloc: alloc/free test
...
(after test 2)
[  387.897816] Fast slub free: 748968 (diff 91630)
[  387.907947] Slow slub free: 482584 diff (422)

Therefore, in the test where we have separate passes for slub allocation
and free, we hit mostly the slow path. Any particular reason for that ?

Note that the alloc/free test (test 2) seems to hit the fast path as
expected.

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/