Date:	Thu, 9 Jan 2014 09:13:31 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Pekka Enberg <penberg@...nel.org>, Helge Deller <deller@....de>,
	Mikulas Patocka <mpatocka@...hat.com>,
	Andi Kleen <ak@...ux.intel.com>,
	Christoph Lameter <cl@...ux.com>,
	LKML <linux-kernel@...r.kernel.org>, linux-parisc@...r.kernel.org
Subject: Re: [PATCH] fix crash when using XFS on loopback

On Wed, Jan 08, 2014 at 01:59:30PM -0800, Andrew Morton wrote:
> On Wed, 8 Jan 2014 23:37:49 +0200 Pekka Enberg <penberg@...nel.org> wrote:
> 
> > The patch looks good to me but it probably should go through Andrew's tree.
> 
> yup.
> 
> page_mapping() will be called quite frequently, and adding a new
> test-n-branch in there will be somewhat costly.  We might end up with a
> better kernel if we were to instead revert 8456a648cf44f.  How useful
> was that patch?
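
The test-and-branch in question is the PageSlab() check the patch adds to
page_mapping(). A rough sketch of its shape, with the rest of the function
elided (not the exact kernel code):

	struct address_space *page_mapping(struct page *page)
	{
		/* new early test: a slab page carries no address_space */
		if (unlikely(PageSlab(page)))
			return NULL;

		/* ... existing anon/swap-cache handling elided ... */
		return page->mapping;
	}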

Hello,

The performance effect of this patch was described in the cover letter, but
I failed to carry it over into the patch description. Sorry about that.

In summary, this patch saves some memory and reduces the cache footprint,
which improves performance.
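
The saving comes from 8456a648cf44f moving the on-slab management data into
the struct page that already describes the slab's page. Roughly, with field
names as I remember them and unrelated fields elided:

	struct page {
		unsigned long flags;
		union {
			struct address_space *mapping;	/* page cache */
			void *s_mem;		/* SLAB: first object address */
		};
		union {
			pgoff_t index;		/* page cache offset */
			void *freelist;		/* SLAB: first free object */
		};
		/* ... */
		struct kmem_cache *slab_cache;	/* SLAB: owning cache */
	};

Since a struct page exists for every page anyway, reusing its fields frees
the space the separate on-slab management structure used to occupy, which is
where the extra object per slab shown below comes from.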

Here is the description from the cover letter.

Below are some numbers from 'cat /proc/slabinfo'.

* Before *
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...]
kmalloc-512          527    600    512    8    1 : tunables   54   27    0 : slabdata     75     75      0
kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
kmalloc-192         1040   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
kmalloc-96           750    750    128   30    1 : tunables  120   60    0 : slabdata     25     25      0
kmalloc-64          2773   2773     64   59    1 : tunables  120   60    0 : slabdata     47     47      0
kmalloc-128          660    690    128   30    1 : tunables  120   60    0 : slabdata     23     23      0
kmalloc-32         11200  11200     32  112    1 : tunables  120   60    0 : slabdata    100    100      0
kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0

* After *
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables [snip...]
kmalloc-512          525    640    512    8    1 : tunables   54   27    0 : slabdata     80     80      0
kmalloc-256          210    210    256   15    1 : tunables  120   60    0 : slabdata     14     14      0
kmalloc-192         1016   1040    192   20    1 : tunables  120   60    0 : slabdata     52     52      0
kmalloc-96           560    620    128   31    1 : tunables  120   60    0 : slabdata     20     20      0
kmalloc-64          2148   2280     64   60    1 : tunables  120   60    0 : slabdata     38     38      0
kmalloc-128          647    682    128   31    1 : tunables  120   60    0 : slabdata     22     22      0
kmalloc-32         11360  11413     32  113    1 : tunables  120   60    0 : slabdata    101    101      0
kmem_cache           197    200    192   20    1 : tunables  120   60    0 : slabdata     10     10      0

kmem_caches whose objects are 128 bytes or smaller fit one more object per
slab; you can see this in the objperslab column.
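
Taking kmalloc-32 as a worked example (pagesperslab is 1, so one slab is a
single page, assuming 4 KB pages):

	before: 112 objects * 32 bytes = 3584 bytes used, 512 bytes left over
	after:  113 objects * 32 bytes = 3616 bytes used, 480 bytes left over

so every kmalloc-32 slab page now holds one more usable object.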


Here are the performance results on my 4-CPU machine.

* Before *

 Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):

       238,309,671 cache-misses                                                  ( +-  0.40% )

      12.010172090 seconds time elapsed                                          ( +-  0.21% )

* After *

 Performance counter stats for 'perf bench sched messaging -g 50 -l 1000' (10 runs):

       229,945,138 cache-misses                                                  ( +-  0.23% )

      11.627897174 seconds time elapsed                                          ( +-  0.14% )

Cache misses are reduced by this patchset by roughly 3.5%, and elapsed time
improves by about 3.2% over the baseline.
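
For reference, the percentages are computed from the numbers above:

	cache-misses: (238,309,671 - 229,945,138) / 238,309,671 = ~3.5%
	elapsed time: (12.010 - 11.628) / 12.010 = ~3.2%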

Thanks.