Date:	Tue, 6 Dec 2011 23:28:27 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Shaohua Li <shaohua.li@...el.com>
cc:	"Shi, Alex" <alex.shi@...el.com>, Christoph Lameter <cl@...ux.com>,
	"penberg@...nel.org" <penberg@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH 1/3] slub: set a criteria for slub node partial adding

On Wed, 7 Dec 2011, Shaohua Li wrote:

> Interesting. I did a similar experiment before (trying to sort the pages
> according to their free counts), but it turned out to be quite hard. The
> free count of a page is dynamic, e.g. more objects can be freed while the
> page sits on the partial list. And in the netperf test, the partial list
> can get very, very long. Can you post your patch? I definitely want to
> look at it.

That was a couple of years ago and the slub code has changed 
significantly since then, but you can see the general concept of the "slab 
thrashing" problem with netperf, and my solution at the time, here:

	http://marc.info/?l=linux-kernel&m=123839191416478
	http://marc.info/?l=linux-kernel&m=123839203016592
	http://marc.info/?l=linux-kernel&m=123839202916583

I also had a separate patchset that, instead of this approach, simply 
iterated through the partial list in get_partial_node() looking for any 
slab whose number of free objects met a certain threshold (still 
defaulting to 25%) and picked the first match immediately.  The overhead 
was taking slab_lock() for each page scanned, but that was nullified by 
the speedup of using the alloc fastpath a majority of the time for both 
kmalloc-256 and kmalloc-2k, where in the past a partial slab had only 
been able to serve one or two allocs.  If no partial slab met the 
threshold, the slab_lock() of the partial slab with the most free 
objects was taken and that slab was returned instead.
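
For illustration, here's a quick userspace model of that selection policy 
(not the actual patchset, and it leaves out the slab_lock() handling; all 
the names below are made up):

#include <stddef.h>

struct slab {
	struct slab *next;	/* partial list linkage */
	unsigned int objects;	/* total objects in this slab */
	unsigned int free;	/* objects currently free */
};

static struct slab *pick_partial(struct slab *head)
{
	struct slab *best = NULL;
	struct slab *s;

	for (s = head; s; s = s->next) {
		/* First slab at or above 25% free is taken immediately. */
		if (s->free * 4 >= s->objects)
			return s;
		/* Otherwise remember the one with the most free objects. */
		if (!best || s->free > best->free)
			best = s;
	}
	return best;
}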

> What concerns me about the partial list is that it wastes a lot of memory.

That's not going to be helped by the above approach, since it 
deliberately picks partial slabs with many free objects to fill, but it 
also won't be severely impacted: if the threshold is kept small enough, 
we simply return the first partial slab that meets the criteria, which 
allows the partial slabs at the end of the list to hopefully become 
mostly free.

And, for completeness, there's also a possibility that you have some 
completely free slabs on the partial list that coule be freed back to the 
buddy allocator by decreasing min_partial by way of 
/sys/kernel/slab/cache/min_partial at the risk of performance and then 
invoke /sys/kernel/slab/cache/shrink to free the unused slabs.
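
For example, something like this (a userspace sketch; needs root, and 
kmalloc-256 just stands in for whichever cache you're tuning):

#include <stdio.h>
#include <stdlib.h>

static void write_sysfs(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		exit(EXIT_FAILURE);
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	/* Keep fewer partial slabs around, trading some performance. */
	write_sysfs("/sys/kernel/slab/kmalloc-256/min_partial", "2");
	/* Writing 1 to shrink frees unused slabs back to the buddy allocator. */
	write_sysfs("/sys/kernel/slab/kmalloc-256/shrink", "1");
	return 0;
}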
