Message-ID: <alpine.DEB.2.00.0904071231500.21113@chino.kir.corp.google.com>
Date: Tue, 7 Apr 2009 12:44:13 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Pekka Enberg <penberg@...helsinki.fi>
cc: Christoph Lameter <cl@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [patch] slub: default min_partial to at least highest cpus per
node
On Tue, 7 Apr 2009, Pekka Enberg wrote:
> > Hmm, partial lists are per-node, so wouldn't it be better to do the
> > adjustment for every struct kmem_cache_node separately? The
> > 'min_partial_per_node' global seems just too ugly and confusing to live
> > with.
>
> Btw, that requires moving ->min_partial to struct kmem_cache_node from
> struct kmem_cache. But I think that makes a whole lot of sense if
> some nodes may have more CPUs than others.
>
> And while the improvement is kinda obvious, I would be interested to
> know what kind of workload benefits from this patch (and see numbers
> if there are any).
>
It doesn't really depend on the workload; it depends on the type of NUMA
machine it's running on (specifically, whether the number of cpus is
asymmetric across nodes).
Since min_partial_per_node is capped at MAX_PARTIAL, this is only really
relevant for remote node defragmentation when it's allowed (and not just 2%
of the time, as with the default). We want to avoid stealing partial slabs
from a remote node that has fewer partial slabs than cpus. Otherwise, it's
possible for each cpu on the victim node to try to allocate a single object
and require nr_cpus_node(node) new slab allocations. In that case it's
entirely possible for the majority of cpus to end up with cpu slabs from
remote nodes. This change reduces the likelihood of that happening because
we'll always have cpu slab replacements on our local partial list before
allowing remote defragmentation.
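To make the intent concrete, here's a minimal userspace sketch of the guard
the patch adds (the function and variable names here are mine, for
illustration only; the actual kernel check is in the diff below):

	#include <stdio.h>

	/*
	 * Model of the guard: only defragment a remote ("victim") node
	 * if it would still retain at least one partial slab per cpu on
	 * that node, so its own cpus can refill their cpu slabs without
	 * allocating new slab pages.
	 */
	static int may_steal_partial(int victim_nr_partial, int victim_nr_cpus)
	{
		return victim_nr_partial > victim_nr_cpus;
	}

	int main(void)
	{
		/* 16 partial slabs on a 16-cpu node: refuse to steal. */
		printf("%d\n", may_steal_partial(16, 16));	/* prints 0 */
		/* 17 partial slabs: one can be spared. */
		printf("%d\n", may_steal_partial(17, 16));	/* prints 1 */
		return 0;
	}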
I'd be just as happy with the following, although, for optimal performance,
it would require raising MIN_PARTIAL above its default of 5 on nodes with
more cpus than that (the old patch did so automatically, up to MAX_PARTIAL).
---
diff --git a/mm/slub.c b/mm/slub.c
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1326,11 +1326,13 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
 	zonelist = node_zonelist(slab_node(current->mempolicy), flags);
 	for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
 		struct kmem_cache_node *n;
+		int node;
 
-		n = get_node(s, zone_to_nid(zone));
+		node = zone_to_nid(zone);
+		n = get_node(s, node);
 		if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
-				n->nr_partial > s->min_partial) {
+				n->nr_partial > nr_cpus_node(node)) {
 			page = get_partial_node(n);
 			if (page)
 				return page;
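
For comparison, here's a minimal sketch of the per-node minimum the earlier
patch effectively computed, clamped to [MIN_PARTIAL, MAX_PARTIAL] (the
helper name is mine, and I'm assuming MAX_PARTIAL is still 10 here):

	#define MIN_PARTIAL	5
	#define MAX_PARTIAL	10

	/* Floor of partial slabs to keep on a node with the given cpu count. */
	static unsigned long min_partial_for_node(unsigned int nr_cpus_on_node)
	{
		unsigned long min = nr_cpus_on_node;

		if (min < MIN_PARTIAL)
			min = MIN_PARTIAL;
		else if (min > MAX_PARTIAL)
			min = MAX_PARTIAL;
		return min;
	}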
--