Message-ID: <alpine.DEB.2.00.1112090203370.12604@chino.kir.corp.google.com>
Date:	Fri, 9 Dec 2011 02:10:52 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	"Alex,Shi" <alex.shi@...el.com>
cc:	Christoph Lameter <cl@...ux.com>,
	"penberg@...nel.org" <penberg@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: [PATCH 1/3] slub: set a criteria for slub node partial adding

On Fri, 9 Dec 2011, Alex,Shi wrote:

> I did some experiments on the add_partial judgment against rc4, such as
> putting the slab at the head or tail of the node partial list according
> to the number of free objects, or, as Eric suggested, combining that
> with the external parameter, like below:
> 
>         n->nr_partial++;
> -       if (tail == DEACTIVATE_TO_TAIL)
> +       if (tail == DEACTIVATE_TO_TAIL ||
> +               page->inuse > page->objects / 2)
>                 list_add_tail(&page->lru, &n->partial);
>         else
>                 list_add(&page->lru, &n->partial);
> 
> But the result was not what I expected.

I don't think you'll get consistent results for all workloads with 
something like this; some things may appear better and other things may 
appear worse.  That's why I've always disagreed with determining whether 
it should be added to the head or to the tail at the time of deactivation: 
you know nothing about frees happening to that slab subsequent to the 
decision you've made.  The only thing that's guaranteed is that you've 
thrown cache-hot objects out the window and potentially increased the 
number of internally fragmented slabs and/or unnecessarily long partial 
lists.
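
For illustration, here is a minimal user-space sketch of the two insertion 
points being discussed.  The fake_slab structure and list below are made up 
for illustration only (not the kernel's struct page / kmem_cache_node); the 
point is just that the allocator refills by scanning n->partial from the 
front, so a slab added at the head is reused first, while its objects may 
still be cache hot, and a slab added at the tail is reused last:

	/*
	 * Minimal user-space sketch, not kernel code.  It only models the
	 * ordering effect of list_add() vs list_add_tail() on a circular
	 * doubly linked partial list.
	 */
	#include <stdio.h>

	struct fake_slab {
		const char *name;
		int inuse, objects;
		struct fake_slab *prev, *next;
	};

	static struct fake_slab partial = {
		.name = "list-head", .prev = &partial, .next = &partial
	};

	static void list_add(struct fake_slab *s)	/* add at head */
	{
		s->next = partial.next;
		s->prev = &partial;
		partial.next->prev = s;
		partial.next = s;
	}

	static void list_add_tail(struct fake_slab *s)	/* add at tail */
	{
		s->prev = partial.prev;
		s->next = &partial;
		partial.prev->next = s;
		partial.prev = s;
	}

	int main(void)
	{
		struct fake_slab a = { .name = "A", .inuse = 3,  .objects = 16 };
		struct fake_slab b = { .name = "B", .inuse = 12, .objects = 16 };

		/* Alex's experiment: mostly-full slabs go to the tail. */
		if (b.inuse > b.objects / 2)
			list_add_tail(&b);
		else
			list_add(&b);
		list_add(&a);		/* mostly-empty slab goes to the head */

		/* Prints A (scanned first by the allocator), then B. */
		for (struct fake_slab *s = partial.next; s != &partial; s = s->next)
			printf("%s (inuse=%d/%d)\n", s->name, s->inuse, s->objects);
		return 0;
	}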

> Now, with all slabs put at the tail of the node partial list, we get the
> best performance, even though it is just a slight improvement.
> 
> {
>         n->nr_partial++;
> -       if (tail == DEACTIVATE_TO_TAIL)
> -               list_add_tail(&page->lru, &n->partial);
> -       else
> -               list_add(&page->lru, &n->partial);
> +       list_add_tail(&page->lru, &n->partial);
>  }
>  
> On the hackbench process benchmark, this change brings about a 2%
> improvement on our WSM-EP machine and a 1% improvement on our SNB-EP and
> NHM-EX machines, with no clear effect on the Core2 machine.
> 
>  	./hackbench 100 process 2000 
>  
> For multi-client loopback netperf, there is only a suspicious 1%
> improvement on our 2-socket machine; the others show no clear effect.
> 
> But when I check the deactivate_to_head/to_tail statistics on the
> original code, to_head is hit just hundreds or thousands of times, while
> to_tail is hit about tens of millions of times.
> 
> David, would you like to try the above change, moving all slabs to the
> tail of the partial list?
> 
> add_partial statistics collection patch: 
> ---
> commit 1ff731282acb521f3a7c2e3fb94d35ec4d0ff07e
> Author: Alex Shi <alex.shi@...el.com>
> Date:   Fri Dec 9 18:12:14 2011 +0800
> 
>     slub: statistics collection for add_partial
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 5843846..a2b1143 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1904,10 +1904,11 @@ static void unfreeze_partials(struct kmem_cache *s)
>  			if (l != m) {
>  				if (l == M_PARTIAL)
>  					remove_partial(n, page);
> -				else
> +				else {
>  					add_partial(n, page,
>  						DEACTIVATE_TO_TAIL);
> -
> +					stat(s, DEACTIVATE_TO_TAIL);
> +				}
>  				l = m;
>  			}
>  
> @@ -2480,6 +2481,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
>  			remove_full(s, page);
>  			add_partial(n, page, DEACTIVATE_TO_TAIL);
>  			stat(s, FREE_ADD_PARTIAL);
> +			stat(s, DEACTIVATE_TO_TAIL);
>  		}
>  	}
>  	spin_unlock_irqrestore(&n->list_lock, flags);

I'm not sure what you're asking me to test; you would like this:

	{
	        n->nr_partial++;
	-       if (tail == DEACTIVATE_TO_TAIL)
	-               list_add_tail(&page->lru, &n->partial);
	-       else
	-               list_add(&page->lru, &n->partial);
	+       list_add_tail(&page->lru, &n->partial);
	}

with the statistics patch above?  I typically run with CONFIG_SLUB_STATS 
disabled since it impacts performance so heavily, and I'm not sure what 
information you're looking for with regard to those stats.
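
(For reference: with CONFIG_SLUB_STATS enabled these counters end up as 
per-cache files under /sys/kernel/slab/, e.g. deactivate_to_head and 
deactivate_to_tail.  A rough sketch of dumping them follows; the cache name 
kmalloc-64 is only an example, and the files simply won't exist when the 
option is off.)

	#include <stdio.h>

	int main(void)
	{
		/*
		 * Counters touched by the patch above; exposed per cache
		 * under /sys/kernel/slab/<cache>/ when CONFIG_SLUB_STATS=y.
		 */
		const char *stats[] = { "deactivate_to_head",
					"deactivate_to_tail",
					"free_add_partial" };
		char path[256], line[256];

		for (unsigned int i = 0; i < sizeof(stats) / sizeof(stats[0]); i++) {
			snprintf(path, sizeof(path),
				 "/sys/kernel/slab/kmalloc-64/%s", stats[i]);
			FILE *f = fopen(path, "r");
			if (!f)
				continue;	/* stat file not present */
			if (fgets(line, sizeof(line), f))
				printf("%-20s %s", stats[i], line);
			fclose(f);
		}
		return 0;
	}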
