Message-ID: <Pine.LNX.4.64.0805141405350.19073@schroedinger.engr.sgi.com>
Date: Wed, 14 May 2008 14:26:37 -0700 (PDT)
From: Christoph Lameter <clameter@....com>
To: Matt Mackall <mpm@...enic.com>
cc: Andi Kleen <andi@...stfloor.org>,
Pekka Enberg <penberg@...helsinki.fi>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Rik van Riel <riel@...hat.com>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Mel Gorman <mel@...net.ie>, Matthew Wilcox <matthew@....cx>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Subject: Re: [patch 21/21] slab defrag: Obsolete SLAB
On Wed, 14 May 2008, Matt Mackall wrote:
> First, we should obviously always expire all queues when we hit low
> water marks as it'll be cheaper/faster than other forms of reclaim.
Hmmm... I tried a scheme like that a while back, but it did not improve
performance. The cost of queueing the object degraded the fast path. (Note
that object queueing in SLUB is fundamentally different because there is
no in-slab metadata structure.)
> Second, if our queues were per-slab (this might be hard, I realize), we
> can sweep the queue at alloc time.
In that case we dirty the same cacheline in which we also take the page
lock. I wonder whether there would be any difference? The freelist is
essentially a kind of per-page queue (as pointed out by Ingo in the past).
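To make that point concrete: in SLUB the free list is threaded through the
free objects of the page itself, so the page already acts as an implicit
per-page queue. A minimal toy sketch of that idea (illustrative names and
sizes, nothing from the actual kernel code):

```c
#include <stddef.h>

#define OBJ_SIZE      64
#define OBJS_PER_SLAB 8

/* Toy slab page: free objects form a linked list threaded through the
 * objects themselves, so the freelist doubles as a per-page queue. */
struct toy_slab {
	void *freelist;				/* head of in-object free list */
	unsigned int inuse;			/* allocated object count */
	char objects[OBJS_PER_SLAB][OBJ_SIZE];
};

static void toy_slab_init(struct toy_slab *s)
{
	s->freelist = NULL;
	s->inuse = 0;
	/* The first word of each free object stores the pointer to the
	 * next free object. */
	for (int i = OBJS_PER_SLAB - 1; i >= 0; i--) {
		*(void **)s->objects[i] = s->freelist;
		s->freelist = s->objects[i];
	}
}

static void *toy_slab_alloc(struct toy_slab *s)
{
	void *obj = s->freelist;

	if (!obj)
		return NULL;
	s->freelist = *(void **)obj;	/* pop from the per-page queue */
	s->inuse++;
	return obj;
}

static void toy_slab_free(struct toy_slab *s, void *obj)
{
	*(void **)obj = s->freelist;	/* push onto the per-page queue */
	s->freelist = obj;
	s->inuse--;
}
```

Freeing an object is just a push onto this per-page list, which is exactly
why an additional explicit queue on top of it costs an extra cacheline.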
> We can also sweep before falling back to the page allocator. That should
> guarantee that delayed frees don't negatively impact fragmentation.
That would introduce additional complexity for the NUMA case, because now
we would need to distinguish between the nodes that these objects came
from. So we would have to scan the queue and classify the objects? Or
determine each object's node when queueing it and put it into a
remote-node queue? That sounds similar to all the trouble that we ended up
with in SLAB.
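The classification cost is easy to see in a toy model: draining a mixed
per-cpu queue means touching every queued object just to route it back to
its home node. A rough sketch (the `nid` field stands in for the real
page-to-node lookup, e.g. page_to_nid(); all names here are hypothetical):

```c
#include <stddef.h>

#define MAX_NODES 4
#define QUEUE_LEN 16

/* Toy object carrying the node its slab page came from. */
struct toy_obj {
	int nid;
	char payload[56];
};

struct node_list {
	struct toy_obj *objs[QUEUE_LEN];
	int count;
};

/* Drain a mixed free queue: every object must be classified by node
 * before it can be returned, which is the extra per-object work that
 * SLAB-style queueing drags in on NUMA. Returns objects moved. */
static int drain_queue(struct toy_obj **queue, int count,
		       struct node_list lists[MAX_NODES])
{
	int moved = 0;

	for (int i = 0; i < count; i++) {
		int nid = queue[i]->nid;

		if (nid < 0 || nid >= MAX_NODES)
			continue;
		lists[nid].objs[lists[nid].count++] = queue[i];
		moved++;
	}
	return moved;
}
```

Either we pay this scan at drain time, or we pay the classification at
queueing time by keeping per-node queues; both are the SLAB-style trouble.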
> And lastly, we can always have a periodic thread/timer/workqueue
> operation.
I have had enough trouble over the last few years with the 2-second
hiccups that come with SLAB. They affect timing-sensitive operations
between processors in an SMP configuration and also cause trouble for
applications that require low network latencies. I'd rather avoid that.
> So far this is a bunch of hand-waving but I think this ends up basically
> being an anti-magazine. A magazine puts a per-cpu queue on the alloc
> side which costs on both the alloc and free side, regardless of whether
> the workload demands it. This puts a per-cpu queue on the free side that
> we can bypass in the cache-friendly case. I think that's a step in the
> right direction.
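As I read the "anti-magazine" idea, it would look roughly like this
(purely illustrative names, not a proposed implementation): frees to the
cache-hot active slab bypass the queue entirely, and only cold/remote
frees are deferred into the per-cpu free-side queue for later batching.

```c
#include <stddef.h>
#include <stdbool.h>

#define DEFER_LEN 8

struct toy_cpu_cache {
	void *active_slab;		/* slab we currently allocate from */
	void *deferred[DEFER_LEN];	/* free-side queue for cold frees */
	int ndeferred;
};

/* Free an object whose slab is slab_of_obj (in a real allocator this
 * would come from a virt_to_page()-style lookup). Returns true if the
 * free took the fast bypass path. */
static bool toy_free(struct toy_cpu_cache *c, void *obj, void *slab_of_obj)
{
	if (slab_of_obj == c->active_slab) {
		/* Cache-friendly case: object belongs to the hot slab;
		 * free it directly without touching the queue. */
		return true;
	}
	/* Cold case: defer into the per-cpu free-side queue. */
	if (c->ndeferred < DEFER_LEN)
		c->deferred[c->ndeferred++] = obj;
	return false;
}
```

So the queue only sees the frees that would have been cache-cold anyway.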
I think that if you want queues for an SMP-only system, do not care too
much about memory use, don't do frequent allocations on multicore systems,
and can tolerate the hiccups because your application does not care (most
enterprise apps are constructed that way), or if you are running
benchmarks that only access a limited dataset that fits into SLAB's queues
and avoid touching the contents of objects, then the SLAB concept is the
right way to go.
If we stripped the NUMA stuff out and made SLAB an SMP-only allocator for
enterprise apps, then the code might become much smaller and simpler. I
believe Arjan suggested something similar in the past. But that would mean
SLAB would no longer be a general allocator.