Message-ID: <alpine.DEB.2.00.0904071231500.21113@chino.kir.corp.google.com>
Date: Tue, 7 Apr 2009 12:44:13 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Pekka Enberg <penberg@...helsinki.fi>
cc: Christoph Lameter <cl@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [patch] slub: default min_partial to at least highest cpus per
node
On Tue, 7 Apr 2009, Pekka Enberg wrote:
> > Hmm, partial lists are per-node, so wouldn't it be better to do the
> > adjustment for every struct kmem_cache_node separately? The
> > 'min_partial_per_node' global seems just too ugly and confusing to live
> > with.
>
> Btw, that requires moving ->min_partial to struct kmem_cache_node from
> struct kmem_cache. But I think that makes a whole lot of sense if
> some nodes may have more CPUs than others.
>
> And while the improvement is kinda obvious, I would be interested to
> know what kind of workload benefits from this patch (and see numbers
> if there are any).
>
It doesn't really depend on the workload; it depends on the type of NUMA
machine it's running on (specifically, whether the number of cpus is
asymmetric across nodes).
Since min_partial_per_node is capped at MAX_PARTIAL, this is only really
relevant for remote node defragmentation when it's allowed (and not just 2%
of the time, as with the default). We want to avoid stealing partial slabs
from a remote node that has fewer partial slabs than cpus. Otherwise, it's
possible for each cpu on the victim node to try to allocate a single object
and require nr_cpus_node(node) new slab allocations. In that case it's
entirely possible for the majority of cpus to end up with cpu slabs from
remote nodes. This change reduces the likelihood of that happening because
we'll always have cpu slab replacements on our local partial list before
allowing remote defragmentation.
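To make the intent concrete, here's a minimal userspace sketch of the guard
the patch adds (the function and variable names here are mine, for
illustration only; the actual kernel check is in the diff below):

	#include <stdio.h>

	/*
	 * Model of the guard: only defragment a remote ("victim") node
	 * if it would still retain at least one partial slab per cpu on
	 * that node, so its own cpus can refill their cpu slabs without
	 * allocating new slab pages.
	 */
	static int may_steal_partial(int victim_nr_partial, int victim_nr_cpus)
	{
		return victim_nr_partial > victim_nr_cpus;
	}

	int main(void)
	{
		/* 16 partial slabs on a 16-cpu node: refuse to steal. */
		printf("%d\n", may_steal_partial(16, 16));	/* prints 0 */
		/* 17 partial slabs: one can be spared. */
		printf("%d\n", may_steal_partial(17, 16));	/* prints 1 */
		return 0;
	}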
I'd be just as happy with the following, although, for optimal performance,
it would require raising MIN_PARTIAL above its default of 5 on nodes with
more cpus than that (the old patch did so automatically, up to MAX_PARTIAL).
---
diff --git a/mm/slub.c b/mm/slub.c
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1326,11 +1326,13 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
 	zonelist = node_zonelist(slab_node(current->mempolicy), flags);
 	for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
 		struct kmem_cache_node *n;
+		int node;
 
-		n = get_node(s, zone_to_nid(zone));
+		node = zone_to_nid(zone);
+		n = get_node(s, node);
 		if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
-				n->nr_partial > s->min_partial) {
+				n->nr_partial > nr_cpus_node(node)) {
 			page = get_partial_node(n);
 			if (page)
 				return page;
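
For comparison, here's a minimal sketch of the per-node minimum the earlier
patch effectively computed, clamped to [MIN_PARTIAL, MAX_PARTIAL] (the
helper name is mine, and I'm assuming MAX_PARTIAL is still 10 here):

	#define MIN_PARTIAL	5
	#define MAX_PARTIAL	10

	/* Floor of partial slabs to keep on a node with the given cpu count. */
	static unsigned long min_partial_for_node(unsigned int nr_cpus_on_node)
	{
		unsigned long min = nr_cpus_on_node;

		if (min < MIN_PARTIAL)
			min = MIN_PARTIAL;
		else if (min > MAX_PARTIAL)
			min = MAX_PARTIAL;
		return min;
	}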
--