Message-ID: <Pine.LNX.4.64.0710180203500.13576@schroedinger.engr.sgi.com>
Date:	Thu, 18 Oct 2007 02:13:34 -0700 (PDT)
From:	Christoph Lameter <clameter@....com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Yasunori Goto <y-goto@...fujitsu.com>,
	Linux Kernel ML <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>
Subject: Re: [Patch](memory hotplug) Make kmem_cache_node for SLUB on memory
 online to avoid panic(take 3)

On Thu, 18 Oct 2007, Andrew Morton wrote:

> > Slab brings up a per node structure when the corresponding cpu is brought 
> > up. That was sufficient as long as we did not have any memoryless nodes. 
> > Now we may have to fix some things over there as well.
> 
> Is there any point?  Our time would be better spent in making
> slab.c go away.  How close are we to being able to do that anyway?

Well, the problem right now is the regression in slab_free() on SMP. 
AFAICT UP and NUMA are fine, as are most loads under SMP. Concurrent 
allocations / frees on multiple processors are several times faster (I see 
up to 10-fold improvements on an 8p).

However, long sequences of free operations from a single processor under 
SMP require too many atomic operations compared with SLAB. If I only do 
frees on a single processor on SMP then I can produce a 30% regression for 
slabs between 128 and 1024 bytes in size. I have a patchset in the works 
that reduces the atomic operations for those.

SLAB currently has an advantage since it uses coarser-grained locking. 
SLAB can take a global lock and then perform queue operations on 
multiple objects. SLUB has fine-grained locking, which increases 
concurrency but also the overhead of atomic operations.
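
To make the tradeoff concrete, here is a rough userspace sketch in C11. It is
not SLAB or SLUB code: struct toy_cache, toy_lock(), free_batched() and
free_one_by_one() are hypothetical names made up for illustration, and a
single spin flag stands in for whatever lock each allocator really takes. It
only counts how many atomic lock/unlock operations a run of frees costs when
the lock is taken once per batch versus once per object.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct object {
	struct object *next;
};

struct toy_cache {
	atomic_flag lock;        /* stand-in for the allocator's lock */
	struct object *freelist; /* singly linked list of free objects */
	unsigned long sync_ops;  /* lock + unlock operations performed */
};

void toy_cache_init(struct toy_cache *c)
{
	atomic_flag_clear(&c->lock);
	c->freelist = NULL;
	c->sync_ops = 0;
}

void toy_lock(struct toy_cache *c)
{
	/* Spin until the flag was previously clear (atomic test-and-set). */
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;
	c->sync_ops++;
}

void toy_unlock(struct toy_cache *c)
{
	c->sync_ops++;
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Coarse-grained style: one lock round trip covers the whole batch. */
void free_batched(struct toy_cache *c, struct object **objs, size_t n)
{
	assert(c != NULL && objs != NULL);
	toy_lock(c);
	for (size_t i = 0; i < n; i++) {
		objs[i]->next = c->freelist;
		c->freelist = objs[i];
	}
	toy_unlock(c);
}

/* Fine-grained style: every single free pays its own lock round trip. */
void free_one_by_one(struct toy_cache *c, struct object **objs, size_t n)
{
	assert(c != NULL && objs != NULL);
	for (size_t i = 0; i < n; i++) {
		toy_lock(c);
		objs[i]->next = c->freelist;
		c->freelist = objs[i];
		toy_unlock(c);
	}
}
```

In this model, freeing n objects costs 2 synchronizing operations when
batched and 2*n when done one at a time. The model leaves out contention
entirely, so it shows only the overhead side of the tradeoff, not the
concurrency that finer-grained locking buys.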

The regression does not surface under UP since we do not need to do 
locking. And it does not surface under NUMA since the alien cache stuff in 
SLAB is reducing slab_free performance compared to SMP.
