Date:	Mon, 28 Sep 2015 14:26:39 +0200
From:	Jesper Dangaard Brouer <brouer@...hat.com>
To:	linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>
Cc:	netdev@...r.kernel.org, Jesper Dangaard Brouer <brouer@...hat.com>,
	Alexander Duyck <alexander.duyck@...il.com>,
	Pekka Enberg <penberg@...nel.org>,
	David Rientjes <rientjes@...gle.com>,
	Christoph Lameter <cl@...ux.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>
Subject: [PATCH 7/7] slub: do prefetching in kmem_cache_alloc_bulk()

For practical use cases it is beneficial to prefetch the next freelist
object in the bulk allocation loop.
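
To illustrate the idea outside the kernel, here is a minimal userspace
sketch, assuming a GCC/Clang toolchain (for __builtin_prefetch). All
toy_* names are hypothetical stand-ins and are not part of mm/slub.c;
only the pattern mirrors the patch: after advancing the freelist, hint
the CPU that the free pointer of the next object will be read on the
following iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a slab cache: every free object stores the
 * pointer to the next free object at byte offset `offset`. */
struct toy_cache {
	size_t object_size;
	size_t offset;   /* where the free pointer lives inside an object */
	void *freelist;  /* first free object, or NULL when exhausted */
};

static void *toy_get_freepointer(const struct toy_cache *s, void *object)
{
	void *next;

	/* memcpy avoids alignment assumptions about the object layout */
	memcpy(&next, (char *)object + s->offset, sizeof(next));
	return next;
}

static void toy_set_freepointer(const struct toy_cache *s, void *object,
				void *next)
{
	memcpy((char *)object + s->offset, &next, sizeof(next));
}

/* Hint that the free pointer stored inside `object` will be read soon. */
static void toy_prefetch_freepointer(const struct toy_cache *s, void *object)
{
	if (object)
		__builtin_prefetch((char *)object + s->offset);
}

/* Thread `n` objects carved out of `mem` into a freelist, in address order. */
static void toy_cache_init(struct toy_cache *s, char *mem, size_t n)
{
	size_t i;

	assert(s->offset + sizeof(void *) <= s->object_size);
	for (i = 0; i < n; i++) {
		char *obj = mem + i * s->object_size;
		void *next = (i + 1 < n) ? obj + s->object_size : NULL;

		toy_set_freepointer(s, obj, next);
	}
	s->freelist = n ? mem : NULL;
}

/* Pop up to `size` objects into p[]; returns how many were taken. */
static size_t toy_alloc_bulk(struct toy_cache *s, size_t size, void **p)
{
	size_t i;

	for (i = 0; i < size; i++) {
		void *object = s->freelist;

		if (!object)
			break; /* freelist exhausted */

		s->freelist = toy_get_freepointer(s, object);
		/* Warm the cache line the next iteration will dereference */
		toy_prefetch_freepointer(s, s->freelist);
		p[i] = object;
	}
	return i;
}
```

In this sketch the prefetch is only a hint: it does not change which
objects are returned, so any benefit or cost shows up purely in timing.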

Microbenchmarking shows a change of approximately 1 cycle:

bulk -  prev-patch     -  this patch
   1 -  49 cycles(tsc) - 49 cycles(tsc) - increase in cycles:0
   2 -  30 cycles(tsc) - 31 cycles(tsc) - increase in cycles:1
   3 -  23 cycles(tsc) - 25 cycles(tsc) - increase in cycles:2
   4 -  20 cycles(tsc) - 22 cycles(tsc) - increase in cycles:2
   8 -  18 cycles(tsc) - 19 cycles(tsc) - increase in cycles:1
  16 -  17 cycles(tsc) - 18 cycles(tsc) - increase in cycles:1
  30 -  18 cycles(tsc) - 17 cycles(tsc) - increase in cycles:-1
  32 -  18 cycles(tsc) - 19 cycles(tsc) - increase in cycles:1
  34 -  23 cycles(tsc) - 24 cycles(tsc) - increase in cycles:1
  48 -  21 cycles(tsc) - 22 cycles(tsc) - increase in cycles:1
  64 -  20 cycles(tsc) - 21 cycles(tsc) - increase in cycles:1
 128 -  27 cycles(tsc) - 27 cycles(tsc) - increase in cycles:0
 158 -  30 cycles(tsc) - 30 cycles(tsc) - increase in cycles:0
 250 -  37 cycles(tsc) - 37 cycles(tsc) - increase in cycles:0

Note: the benchmark was run with slab_nomerge to keep results stable
enough for an accurate comparison.

Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
---
 mm/slub.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/slub.c b/mm/slub.c
index c25717ab3b5a..5af75a618b91 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2951,6 +2951,7 @@ bool kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 				goto error;
 
 			c = this_cpu_ptr(s->cpu_slab);
+			prefetch_freepointer(s, c->freelist);
 			continue; /* goto for-loop */
 		}
 
@@ -2960,6 +2961,7 @@ bool kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			goto error;
 
 		c->freelist = get_freepointer(s, object);
+		prefetch_freepointer(s, c->freelist);
 		p[i] = object;
 
 		/* kmem_cache debug support */
