Message-ID: <482B2617.5010605@firstfloor.org>
Date:	Wed, 14 May 2008 19:49:11 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Christoph Lameter <clameter@....com>
CC:	Pekka Enberg <penberg@...helsinki.fi>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Rik van Riel <riel@...hat.com>, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	Mel Gorman <mel@...net.ie>, mpm@...enic.com,
	Matthew Wilcox <matthew@....cx>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Subject: Re: [patch 21/21] slab defrag: Obsolete SLAB

Christoph Lameter wrote:

> Fundamentally there is no way to avoid complex queueing on free() unless 
> one directly frees the object. This is serialized in SLUB by taking a page 
> lock. 

IIRC, profiling showed that the problem was the page lock
serialization (in particular the slab_lock() in __slab_free). That
was on 2.6.24.2.

> However, the "slow" case in SLUB is still much less complex 
> than comparable processing in SLAB. It is quite fast.

Well, in the benchmark it is slower.


> SLAB freeing can avoid taking a lock if
> 
> 1. We can establish that the object is node local (trivial if !NUMA 
> otherwise we need to get the node information from the page struct and 
> compare to the current node).

Ignoring NUMA is not an option, unfortunately. And with integrated memory
controllers, many of the remote CPU frees are off node.

> The main issue for SLAB vs. SLUB on free is likely the !NUMA case in which 
> SLAB can avoid the overhead of the node check (which does not exist in 
> SLUB) and in which case we can always immediately batch the object (if 
> there is space). The additional overhead in SLUB is mainly one 
> atomic instruction over the SLAB fastpath.

I think the problem is that this atomic operation thrashes cache lines
around. Counting cycles per instruction is not really that interesting,
but minimizing the cache thrashing is. And for that it looks like SLUB
is worse.

> So I think that the free needs to stay as is. The disadvantages in terms 
> of the complexity of handling the objects and expiring them and the issue 
> of having to take per node locks in SLAB make it hard to justify adding a 
> queue for free in SLUB. Maybe someone has an inspiration on how to do this 
> effectively, in a way that is better than my attempts, which always ended 
> up implementing code that had the same issues that we have in SLAB.

What is the big problem with having a batched free queue? If the expiry
is done at a well-bounded time (e.g. on interrupt exit or similar),
locally on the CPU, it shouldn't be a big issue, should it?
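
To make the question concrete, a batched free queue could look roughly
like the user-space C sketch below. This is only an illustration under
stated assumptions, not code from SLAB or SLUB: struct shared_pool,
struct free_batch, BATCH_MAX and the batch_*() functions are all
hypothetical names, the lock is simulated by a counter, and objects are
assumed to be at least pointer-sized so the free list can be threaded
through them.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 16	/* hypothetical queue depth, not a kernel value */

/* Stand-in for a shared slab that would normally be protected by a lock.
 * lock_taken counts how often that lock would have been acquired. */
struct shared_pool {
	void *freelist;		/* singly linked list of free objects */
	unsigned long lock_taken;
	unsigned long nr_free;
};

/* Per-CPU queue of objects waiting to be handed back to the pool. */
struct free_batch {
	void *objs[BATCH_MAX];
	size_t count;
};

/* Return every queued object under a single (simulated) lock round trip. */
static void batch_flush(struct free_batch *b, struct shared_pool *p)
{
	size_t i;

	if (b->count == 0)
		return;

	p->lock_taken++;		/* a real lock would be taken here */
	for (i = 0; i < b->count; i++) {
		/* Thread the object onto the pool free list. */
		*(void **)b->objs[i] = p->freelist;
		p->freelist = b->objs[i];
		p->nr_free++;
	}
	b->count = 0;			/* ... and released here */
}

/* Free path: shared state is only touched when the queue is full. */
static void batch_free(struct free_batch *b, struct shared_pool *p, void *obj)
{
	if (b->count == BATCH_MAX)
		batch_flush(b, p);
	assert(b->count < BATCH_MAX);
	b->objs[b->count++] = obj;
}

/* Bounded expiry hook, e.g. called on interrupt exit on the local CPU. */
static void batch_expire(struct free_batch *b, struct shared_pool *p)
{
	batch_flush(b, p);
}
```

With this sketch, freeing 40 objects touches the shared pool only twice
from the free path, and one more time when batch_expire() drains the
remainder. A real implementation would also have to deal with the NUMA
node check and with preemption on the local CPU, which the sketch leaves
out.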

-Andi