linux-kernel - Re: [PATCH v3 4/5] slab: introduce byte sized index for the freelist of a slab

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 3 Dec 2013 11:25:39 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Pekka Enberg <penberg@...nel.org>
Cc:	Christoph Lameter <cl@...ux.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Wanpeng Li <liwanp@...ux.vnet.ibm.com>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 4/5] slab: introduce byte sized index for the freelist
 of a slab

On Mon, Dec 02, 2013 at 05:49:42PM +0900, Joonsoo Kim wrote:
> Currently, the freelist of a slab consist of unsigned int sized indexes.
> Since most of slabs have less number of objects than 256, large sized
> indexes is needless. For example, consider the minimum kmalloc slab. It's
> object size is 32 byte and it would consist of one page, so 256 indexes
> through byte sized index are enough to contain all possible indexes.
> 
> There can be some slabs whose object size is 8 byte. We cannot handle
> this case with byte sized index, so we need to restrict minimum
> object size. Since these slabs are not major, wasted memory from these
> slabs would be negligible.
> 
> Some architectures' page size isn't 4096 bytes and rather larger than
> 4096 bytes (One example is 64KB page size on PPC or IA64) so that
> byte sized index doesn't fit to them. In this case, we will use
> two bytes sized index.
> 
> Below is some number for this patch.
> 
> * Before *
> kmalloc-512          525    640    512    8    1 : tunables   54   27    0 : slabdata     80     80      0
> kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
> kmalloc-192         1016   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
> kmalloc-96           560    620    128   31    1 : tunables  120   60    0 : slabdata     20     20      0
> kmalloc-64          2148   2280     64   60    1 : tunables  120   60    0 : slabdata     38     38      0
> kmalloc-128          647    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11360  11413     32  113    1 : tunables  120   60    0 : slabdata    101    101      0
> kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> * After *
> kmalloc-512          521    648    512    8    1 : tunables   54   27    0 : slabdata     81     81      0
> kmalloc-256          208    208    256   16    1 : tunables  120   60    0 : slabdata     13     13      0
> kmalloc-192         1029   1029    192   21    1 : tunables  120   60    0 : slabdata     49     49      0
> kmalloc-96           529    589    128   31    1 : tunables  120   60    0 : slabdata     19     19      0
> kmalloc-64          2142   2142     64   63    1 : tunables  120   60    0 : slabdata     34     34      0
> kmalloc-128          660    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
> kmalloc-32         11716  11780     32  124    1 : tunables  120   60    0 : slabdata     95     95      0
> kmem_cache           197    210    192   21    1 : tunables  120   60    0 : slabdata     10     10      0
> 
> kmem_caches consisting of objects less than or equal to 256 byte have
> one or more objects than before. In the case of kmalloc-32, we have 11 more
> objects, so 352 bytes (11 * 32) are saved and this is roughly 9% saving of
> memory. Of couse, this percentage decreases as the number of objects
> in a slab decreases.
> 
> Here are the performance results on my 4 cpus machine.
> 
> * Before *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        229,945,138 cache-misses                                                  ( +-  0.23% )
> 
>       11.627897174 seconds time elapsed                                          ( +-  0.14% )
> 
> * After *
> 
>  Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):
> 
>        218,640,472 cache-misses                                                  ( +-  0.42% )
> 
>       11.504999837 seconds time elapsed                                          ( +-  0.21% )
> 
> cache-misses are reduced by this patchset, roughly 5%.
> And elapsed times are improved by 1%.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@....com>
> 

Hello, Christoph.

Can I get your ACK for this patch?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/