Message-ID: <49DBAF7E.30704@cs.helsinki.fi>
Date: Tue, 07 Apr 2009 22:54:38 +0300
From: Pekka Enberg <penberg@...helsinki.fi>
To: David Rientjes <rientjes@...gle.com>
CC: Christoph Lameter <cl@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [patch] slub: default min_partial to at least highest cpus per
node
David Rientjes wrote:
> On Tue, 7 Apr 2009, Pekka Enberg wrote:
>
>>> Hmm, partial lists are per-node, so wouldn't it be better to do the
>>> adjustment for every struct kmem_cache_node separately? The
>>> 'min_partial_per_node' global seems just too ugly and confusing to live
>>> with.
>> Btw, that requires moving ->min_partial to struct kmem_cache_node from
>> struct kmem_cache. But I think that makes a whole lot of sense if
>> some nodes may have more CPUs than others.
>>
>> And while the improvement is kinda obvious, I would be interested to
>> know what kind of workload benefits from this patch (and see numbers
>> if there are any).
>>
>
> It doesn't really depend on the workload; it depends on the type of NUMA
> machine it's running on (and whether the cpu count is asymmetric across
> its nodes).
>
> Since min_partial_per_node is capped at MAX_PARTIAL, this is only really
> relevant for remote node defragmentation if it's allowed (and not just 2%
> of the time, as with the default). We want to avoid stealing partial slabs
> from a remote node if it has fewer partial slabs than cpus.
>
> Otherwise, it's possible for each cpu on the victim node to try to
> allocate a single object and require nr_cpus_node(node) new slab
> allocations. In this case it's entirely possible for the majority of cpus
> to have cpu slabs from remote nodes. This change reduces the likelihood of
> that happening because we'll always have cpu slab replacements on our
> local partial list before allowing remote defragmentation.
>
> I'd be just as happy with the following, although for optimal performance
> it would require raising MIN_PARTIAL above its default of 5 when a node
> has more cpus than that (the old patch did that automatically, up to
> MAX_PARTIAL).
Hmm, but why not move ->min_partial to struct kmem_cache_node as I
suggested and make sure it's adjusted per node based on nr_cpus_node()?
Something along the lines of the rough sketch below the quoted diff.
> diff --git a/mm/slub.c b/mm/slub.c
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1326,11 +1326,13 @@ static struct page *get_any_partial(struct kmem_cache *s, gfp_t flags)
> zonelist = node_zonelist(slab_node(current->mempolicy), flags);
> for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
> struct kmem_cache_node *n;
> + int node;
>
> - n = get_node(s, zone_to_nid(zone));
> + node = zone_to_nid(zone);
> + n = get_node(s, node);
>
> if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
> - n->nr_partial > s->min_partial) {
> + n->nr_partial > nr_cpus_node(node)) {
> page = get_partial_node(n);
> if (page)
> return page;
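
Roughly what I have in mind is the untested sketch below. It only assumes
the helpers and constants that already exist in mm/slub.c (MIN_PARTIAL,
MAX_PARTIAL, nr_cpus_node()); the per-node field placement and the
reworked set_min_partial() signature are illustrative, not a tested patch:

struct kmem_cache_node {
	spinlock_t list_lock;		/* Protect partial list and nr_partial */
	unsigned long nr_partial;
	unsigned long min_partial;	/* moved here from struct kmem_cache */
	struct list_head partial;
	/* ... rest unchanged ... */
};

static void set_min_partial(struct kmem_cache_node *n, int node,
						unsigned long min)
{
	/*
	 * Never go below the number of cpus on this node, but keep the
	 * old MIN_PARTIAL/MAX_PARTIAL bounds.
	 */
	if (min < (unsigned long)nr_cpus_node(node))
		min = nr_cpus_node(node);
	if (min < MIN_PARTIAL)
		min = MIN_PARTIAL;
	else if (min > MAX_PARTIAL)
		min = MAX_PARTIAL;
	n->min_partial = min;
}

get_any_partial() would then keep the existing test but against the
per-node value, i.e. n->nr_partial > n->min_partial, so a node with many
cpus holds on to more partial slabs before we allow remote defragmentation.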