Message-ID: <alpine.DEB.2.00.1107311426001.944@chino.kir.corp.google.com>
Date:	Sun, 31 Jul 2011 14:55:20 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Pekka Enberg <penberg@...nel.org>
cc:	Christoph Lameter <cl@...ux.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>, hughd@...gle.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [GIT PULL] Lockless SLUB slowpaths for v3.1-rc1

On Sun, 31 Jul 2011, Pekka Enberg wrote:
> > And although slub is definitely heading in the right direction regarding 
> > the netperf benchmark, it's still a non-starter for anybody using large 
> > NUMA machines for networking performance.  On my 16-core, 4 node, 64GB 
> > client/server machines running netperf TCP_RR with various thread counts 
> > for 60 seconds each on 3.0:
> > 
> > 	threads		SLUB		SLAB		diff
> > 	 16		76345		74973		- 1.8%
> > 	 32		116380		116272		- 0.1%
> > 	 48		150509		153703		+ 2.1%
> > 	 64		187984		189750		+ 0.9%
> > 	 80		216853		224471		+ 3.5%
> > 	 96		236640		249184		+ 5.3%
> > 	112		256540		275464		+ 7.4%
> > 	128		273027		296014		+ 8.4%
> > 	144		281441		314791		+11.8%
> > 	160		287225		326941		+13.8%
> 
> That looks like a pretty nasty scaling issue. David, would it be
> possible to see 'perf report' for the 160 case? [ Maybe even 'perf
> annotate' for the interesting SLUB functions. ]
> 

More interesting than the perf report (which just shows kfree,
kmem_cache_free, and kmem_cache_alloc dominating) are the statistics
exported by slub itself; they show the "slab thrashing" issue that I
have described several times over the past few years.  It's difficult
to address because it's a result of slub's design.  From the client
side of 160 netperf TCP_RR threads run for 60 seconds:

	cache		alloc_fastpath		alloc_slowpath
	kmalloc-256	10937512 (62.8%)	6490753
	kmalloc-1024	17121172 (98.3%)	303547
	kmalloc-4096	5526281			11910454 (68.3%)

	cache		free_fastpath		free_slowpath
	kmalloc-256	15469			17412798 (99.9%)
	kmalloc-1024	11604742 (66.6%)	5819973
	kmalloc-4096	14848			17421902 (99.9%)
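
For reference, those counters are slub's per-cache statistics under
/sys/kernel/slab/<cache>/ and are only present on kernels built with
CONFIG_SLUB_STATS.  A minimal, untested sketch of how one might collect
the percentages above -- assuming the first field of each stat file is
the aggregate count, with the per-cpu breakdown following it:

# Sketch: summarize slub fastpath/slowpath hit rates from sysfs.
# Assumes CONFIG_SLUB_STATS and that the first field of each stat
# file is the aggregate count (per-cpu values follow it).
import os

CACHES = ["kmalloc-256", "kmalloc-1024", "kmalloc-4096"]
PAIRS = [("alloc_fastpath", "alloc_slowpath"),
         ("free_fastpath", "free_slowpath")]

def read_stat(cache, stat):
    """Return the aggregate count from a slub stat file (first field)."""
    with open(os.path.join("/sys/kernel/slab", cache, stat)) as f:
        return int(f.read().split()[0])

def pct(part, total):
    return "%d (%.1f%%)" % (part, 100.0 * part / total if total else 0.0)

for fast, slow in PAIRS:
    print("%-16s%-26s%s" % ("cache", fast, slow))
    for cache in CACHES:
        f, s = read_stat(cache, fast), read_stat(cache, slow)
        print("%-16s%-26s%s" % (cache, pct(f, f + s), pct(s, f + s)))
    print()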

With those stats, there's no way that slub will be able to compete with
slab because it's not optimized for the slowpath.  There are ways to
mitigate that, such as my slab thrashing patchset from a couple of years
ago (which you tracked for a while) that improved performance 3-4% at
the cost of an extra increment in the fastpath, but everything else
requires more memory.  You could preallocate slabs on the partial list,
increase the per-node min_partial, increase the order of the slabs
themselves so you hit the free fastpath much more often, etc., but they
all come at a considerable cost in memory.
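
For completeness, min_partial is writable per cache under
/sys/kernel/slab/, and larger slab orders are typically requested at
boot with the slub_min_order= / slub_min_objects= parameters.  A rough,
untested sketch of that kind of tuning for the caches above -- the
value 32 is purely illustrative, not a recommendation:

# Sketch: trade memory for fewer slowpath hits by keeping more partial
# slabs per node on the hot kmalloc caches.  Requires root and slub.
for cache in ("kmalloc-256", "kmalloc-1024", "kmalloc-4096"):
    with open("/sys/kernel/slab/%s/min_partial" % cache, "w") as f:
        f.write("32\n")   # illustrative value; costs memory per node
# Raising the slab order itself is the job of the slub_min_order= /
# slub_min_objects= boot parameters, and costs memory across the cache.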

I'm very confident that slub could beat slab on any system if you throw
enough memory at it because its fastpaths are extremely efficient, but
there's no business case for that.