Date:	Mon, 30 Mar 2009 13:38:24 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Mel Gorman <mel@....ul.ie>
cc:	Pekka Enberg <penberg@...helsinki.fi>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Martin Bligh <mbligh@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [patch 1/3] slub: add per-cache slab thrash ratio

On Mon, 30 Mar 2009, Mel Gorman wrote:

> netperf and tbench will both pound the sl*b allocator far more than sysbench
> will, in my opinion, although I don't have figures on hand to back that up.
> In the case of netperf, it might be particularly obvious if the client is on
> one CPU and the server on another, because I believe that means all allocs
> happen on one CPU and all frees on another.
> 

My results are for two 16-core 64G machines on the same rack, one running 
netserver and the other running netperf.

> I have a vague concern that such a tunable needs to exist at all though
> and wonder what workloads it can hurt when set to 20 for example versus any
> other value.
> 

The tunable needs to exist unless a counterproposal is made that fixes 
this slub performance degradation relative to slab.  I'd be very 
interested to hear other proposals on how to detect and remedy such 
situations in the allocator without adding a tunable.

As I mentioned previously in response to Pekka, it won't cause a further 
regression if sane SLAB_THRASHING_THRESHOLD and slab_thrash_ratio values 
are chosen.  The rules are pretty simple, as described by the 
implementation: if a cpu slab can only allocate 20% of its objects three 
times in a row, we pick a slab with more free objects from the partial 
list while holding list_lock, as opposed to constantly contending on it.  
This is particularly important for the netperf benchmark because the only 
cpu slabs that thrash are the ones with NUMA locality to the cpu taking 
the networking interrupt (remote_node_defrag_ratio was left at its 
default, meaning we avoid remote node defragmentation 98% of the time).
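
For anyone reading along without the patch handy, here's a rough, 
standalone sketch of the counting rule described above.  The names 
SLAB_THRASHING_THRESHOLD and slab_thrash_ratio follow the description; 
the struct, the helper, and the numbers in main() are made up for 
illustration and are not the actual slub.c code:

/* Illustrative sketch only -- not the real slub.c implementation. */
#include <stdbool.h>
#include <stdio.h>

#define SLAB_THRASHING_THRESHOLD 3	/* consecutive low-yield cpu slabs */

struct cpu_slab_stats {
	unsigned int objects_per_slab;	/* capacity of one slab */
	unsigned int objects_allocated;	/* served from the current cpu slab */
	unsigned int thrash_count;	/* consecutive low-yield refills */
};

/*
 * Called when the cpu slab is exhausted and must be replaced from the
 * partial list.  Returns true once the last SLAB_THRASHING_THRESHOLD cpu
 * slabs each yielded less than slab_thrash_ratio percent of their objects;
 * at that point the allocator would prefer a partial slab with more free
 * objects instead of repeatedly grabbing (and quickly exhausting) the head
 * of the list under list_lock.
 */
static bool cpu_slab_is_thrashing(struct cpu_slab_stats *c,
				  unsigned int slab_thrash_ratio)
{
	unsigned int min_objects =
		c->objects_per_slab * slab_thrash_ratio / 100;

	if (c->objects_allocated < min_objects)
		c->thrash_count++;
	else
		c->thrash_count = 0;

	c->objects_allocated = 0;	/* start counting for the next cpu slab */
	return c->thrash_count >= SLAB_THRASHING_THRESHOLD;
}

int main(void)
{
	struct cpu_slab_stats c = { .objects_per_slab = 32 };
	/* objects served per cpu slab before it was exhausted */
	unsigned int yields[] = { 30, 5, 4, 3, 20 };

	for (unsigned int i = 0; i < sizeof(yields) / sizeof(yields[0]); i++) {
		c.objects_allocated = yields[i];
		printf("refill %u: thrashing=%d\n", i,
		       cpu_slab_is_thrashing(&c, 20 /* slab_thrash_ratio */));
	}
	return 0;
}

The real patch keeps this bookkeeping on the per-cpu structure and makes 
the partial-list choice under list_lock, but the intent is the same rule: 
N consecutive cpu slabs that each yield less than the configured 
percentage of their objects.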

I haven't measured the fastpath implications of non-thrashing caches (the 
increment in the alloc fastpath and the conditional in the alloc slowpath 
for partial list sorting) yet, but your suggested experiments should show 
that quite well.