Message-ID: <Pine.LNX.4.64.0805141405350.19073@schroedinger.engr.sgi.com>
Date: Wed, 14 May 2008 14:26:37 -0700 (PDT)
From: Christoph Lameter <clameter@....com>
To: Matt Mackall <mpm@...enic.com>
cc: Andi Kleen <andi@...stfloor.org>,
Pekka Enberg <penberg@...helsinki.fi>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Rik van Riel <riel@...hat.com>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Mel Gorman <mel@...net.ie>, Matthew Wilcox <matthew@....cx>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Subject: Re: [patch 21/21] slab defrag: Obsolete SLAB
On Wed, 14 May 2008, Matt Mackall wrote:
> First, we should obviously always expire all queues when we hit low
> water marks as it'll be cheaper/faster than other forms of reclaim.
Hmmm... I tried a scheme like that a while back, but it did not improve
performance. The cost of queueing the object degraded the fast path. (Note
that object queueing in SLUB is fundamentally different because there is
no in-slab metadata structure.)
> Second, if our queues were per-slab (this might be hard, I realize), we
> can sweep the queue at alloc time.
In that case we dirty the same cacheline in which we also take the page
lock. I wonder whether there would be any difference? The freelist is
essentially a kind of per-page queue (as pointed out by Ingo in the past).
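To make that point concrete: in SLUB the free list is threaded through the
free objects of the page itself, so the page already acts as an implicit
per-page queue. A minimal toy sketch of that idea (illustrative names and
sizes, nothing from the actual kernel code):

```c
#include <stddef.h>

#define OBJ_SIZE      64
#define OBJS_PER_SLAB 8

/* Toy slab page: free objects form a linked list threaded through the
 * objects themselves, so the freelist doubles as a per-page queue. */
struct toy_slab {
	void *freelist;				/* head of in-object free list */
	unsigned int inuse;			/* allocated object count */
	char objects[OBJS_PER_SLAB][OBJ_SIZE];
};

static void toy_slab_init(struct toy_slab *s)
{
	s->freelist = NULL;
	s->inuse = 0;
	/* The first word of each free object stores the pointer to the
	 * next free object. */
	for (int i = OBJS_PER_SLAB - 1; i >= 0; i--) {
		*(void **)s->objects[i] = s->freelist;
		s->freelist = s->objects[i];
	}
}

static void *toy_slab_alloc(struct toy_slab *s)
{
	void *obj = s->freelist;

	if (!obj)
		return NULL;
	s->freelist = *(void **)obj;	/* pop from the per-page queue */
	s->inuse++;
	return obj;
}

static void toy_slab_free(struct toy_slab *s, void *obj)
{
	*(void **)obj = s->freelist;	/* push onto the per-page queue */
	s->freelist = obj;
	s->inuse--;
}
```

Freeing an object is just a push onto this per-page list, which is exactly
why an additional explicit queue on top of it costs an extra cacheline.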
> We can also sweep before falling back to the page allocator. That should
> guarantee that delayed frees don't negatively impact fragmentation.
That would introduce additional complexity for the NUMA case, because now
we would need to distinguish between the nodes that these objects came
from. So we would have to scan the queue and classify the objects? Or
determine each object's node when queueing it and put it into a
remote-node queue? That sounds similar to all the trouble that we ended up
with in SLAB.
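The classification cost is easy to see in a toy model: draining a mixed
per-cpu queue means touching every queued object just to route it back to
its home node. A rough sketch (the `nid` field stands in for the real
page-to-node lookup, e.g. page_to_nid(); all names here are hypothetical):

```c
#include <stddef.h>

#define MAX_NODES 4
#define QUEUE_LEN 16

/* Toy object carrying the node its slab page came from. */
struct toy_obj {
	int nid;
	char payload[56];
};

struct node_list {
	struct toy_obj *objs[QUEUE_LEN];
	int count;
};

/* Drain a mixed free queue: every object must be classified by node
 * before it can be returned, which is the extra per-object work that
 * SLAB-style queueing drags in on NUMA. Returns objects moved. */
static int drain_queue(struct toy_obj **queue, int count,
		       struct node_list lists[MAX_NODES])
{
	int moved = 0;

	for (int i = 0; i < count; i++) {
		int nid = queue[i]->nid;

		if (nid < 0 || nid >= MAX_NODES)
			continue;
		lists[nid].objs[lists[nid].count++] = queue[i];
		moved++;
	}
	return moved;
}
```

Either we pay this scan at drain time, or we pay the classification at
queueing time by keeping per-node queues; both are the SLAB-style trouble.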
> And lastly, we can always have a periodic thread/timer/workqueue
> operation.
I have had enough trouble over the last few years with the 2-second
hiccups that come with SLAB. They affect timing-sensitive operations
between processors in an SMP configuration and also cause trouble for
applications that require low network latencies. I'd rather avoid that.
> So far this is a bunch of hand-waving but I think this ends up basically
> being an anti-magazine. A magazine puts a per-cpu queue on the alloc
> side which costs on both the alloc and free side, regardless of whether
> the workload demands it. This puts a per-cpu queue on the free side that
> we can bypass in the cache-friendly case. I think that's a step in the
> right direction.
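As I read the "anti-magazine" idea, it would look roughly like this
(purely illustrative names, not a proposed implementation): frees to the
cache-hot active slab bypass the queue entirely, and only cold/remote
frees are deferred into the per-cpu free-side queue for later batching.

```c
#include <stddef.h>
#include <stdbool.h>

#define DEFER_LEN 8

struct toy_cpu_cache {
	void *active_slab;		/* slab we currently allocate from */
	void *deferred[DEFER_LEN];	/* free-side queue for cold frees */
	int ndeferred;
};

/* Free an object whose slab is slab_of_obj (in a real allocator this
 * would come from a virt_to_page()-style lookup). Returns true if the
 * free took the fast bypass path. */
static bool toy_free(struct toy_cpu_cache *c, void *obj, void *slab_of_obj)
{
	if (slab_of_obj == c->active_slab) {
		/* Cache-friendly case: object belongs to the hot slab;
		 * free it directly without touching the queue. */
		return true;
	}
	/* Cold case: defer into the per-cpu free-side queue. */
	if (c->ndeferred < DEFER_LEN)
		c->deferred[c->ndeferred++] = obj;
	return false;
}
```

So the queue only sees the frees that would have been cache-cold anyway.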
I think that if you want queues for an SMP-only system, do not care too
much about memory use, don't do frequent allocations on multicore systems,
and can tolerate the hiccups because your application does not care (most
enterprise apps are constructed that way), or if you are running
benchmarks that only access a limited dataset that fits into SLAB's queues
and avoid touching the contents of objects, then the SLAB concept is the
right way to go.
If we stripped the NUMA stuff out and made SLAB an SMP-only allocator for
enterprise apps, then the code might become much smaller and simpler. I
believe Arjan suggested something similar in the past. But that would mean
SLAB would no longer be a general allocator.