Message-ID: <6E3BC7F7C9A4BF4286DD4C043110F30B67236EED18@shsmsx502.ccr.corp.intel.com>
Date: Fri, 9 Dec 2011 21:40:39 +0800
From: "Shi, Alex" <alex.shi@...el.com>
To: David Rientjes <rientjes@...gle.com>
CC: Christoph Lameter <cl@...ux.com>,
"penberg@...nel.org" <penberg@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: RE: [PATCH 1/3] slub: set a criteria for slub node partial adding
> > I did some experiments on the add_partial judgment against rc4, such as putting
> > the slub page at the head or tail of the node partial list according to free objects,
> > or, as Eric suggested, combining that with the existing parameter, like below:
> >
> >  	n->nr_partial++;
> > -	if (tail == DEACTIVATE_TO_TAIL)
> > +	if (tail == DEACTIVATE_TO_TAIL ||
> > +			page->inuse > page->objects / 2)
> >  		list_add_tail(&page->lru, &n->partial);
> >  	else
> >  		list_add(&page->lru, &n->partial);
> >
> > But the result was out of my expectation.
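For reference, here is the heuristic above in its full context: a minimal sketch of add_partial() with that check applied, assuming the 3.2-rc4 layout of mm/slub.c (illustrative only; the exact hunk tested is the one quoted above):

static inline void add_partial(struct kmem_cache_node *n,
				struct page *page, int tail)
{
	n->nr_partial++;
	/*
	 * Put mostly-full slabs at the tail so that mostly-empty
	 * ones are picked up first for further allocations.
	 */
	if (tail == DEACTIVATE_TO_TAIL ||
	    page->inuse > page->objects / 2)
		list_add_tail(&page->lru, &n->partial);
	else
		list_add(&page->lru, &n->partial);
}
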
>
> I don't think you'll get consistent results for all workloads with
> something like this, some things may appear better and other things may
> appear worse. That's why I've always disagreed with determining whether
> it should be added to the head or to the tail at the time of deactivation:
> you know nothing about frees happening to that slab subsequent to the
> decision you've made. The only thing that's guaranteed is that you've
> thrown cache hot objects out the window and potentially increased the
> amount of internally fragmented slabs and/or unnecessarily long partial
> lists.
Saying the result was not my original expectation doesn't mean my data has a problem. :)
Of course any testing shows some variation in its results, but it can be benchmarked accordingly, and there are plenty of techniques for tuning a test so that its standard deviation is acceptable: syncing the system into a clean state, shutting down unnecessary services, using a separate working disk for the test, and so on. As for this data, it is from my SNB-EP machine (the following numbers do not stand for Intel; they are just my personal data).
Four runs of hackbench with this patch give 5.59, 5.475, 5.47833 and 5.504.
More results on the original rc4 range from 5.54 to 5.61; the standard deviation shows the results are stable and believable on my side. But since, among our hundreds of benchmarks, only hackbench and loopback netperf are sensitive to slub changes, and since it seems you did some testing on this, I thought you might like to double-check with real data.
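Just to spell out the spread: the sample standard deviation of those four runs comes out around 0.05s against a mean of about 5.51s, i.e. roughly 1%. A quick standalone check in plain user-space C (nothing to do with the kernel patch; the values are simply copied from the runs above):

/* build: cc stddev.c -o stddev -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
	/* hackbench wall times (seconds) from the four runs above */
	double v[] = { 5.59, 5.475, 5.47833, 5.504 };
	int i, n = sizeof(v) / sizeof(v[0]);
	double sum = 0.0, var = 0.0, mean;

	for (i = 0; i < n; i++)
		sum += v[i];
	mean = sum / n;

	for (i = 0; i < n; i++)
		var += (v[i] - mean) * (v[i] - mean);
	var /= n - 1;			/* sample variance */

	printf("mean %.3f  stddev %.3f\n", mean, sqrt(var));
	return 0;
}
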
In fact, I also collected 'perf stat' data for cache misses and cache references, but those numbers vary too much and are not as stable as the hackbench results themselves.
> Not sure what you're asking me to test, you would like this:
>
> {
> 	n->nr_partial++;
> -	if (tail == DEACTIVATE_TO_TAIL)
> -		list_add_tail(&page->lru, &n->partial);
> -	else
> -		list_add(&page->lru, &n->partial);
> +	list_add_tail(&page->lru, &n->partial);
> }
>
> with the statistics patch above? I typically run with CONFIG_SLUB_STATS
> disabled since it impacts performance so heavily and I'm not sure what
> information you're looking for with regards to those stats.
No. When you collect performance data, please disable CONFIG_SLUB_STATS in the kernel config. The _to_head statistics collection patch was just to show that the statistics I collected do not include the add_partial() call in early_kmem_cache_node_alloc(), while all the other add_partial() call sites are covered. Of course, a kernel with the statistics enabled cannot be used to measure performance.
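To make that concrete, the idea of the statistics patch is simply a counter bump next to each covered call site. A rough sketch, assuming the existing stat() helper and the DEACTIVATE_TO_HEAD/DEACTIVATE_TO_TAIL counters from CONFIG_SLUB_STATS (the real patch may differ in detail):

	/* at each add_partial() call site except early_kmem_cache_node_alloc() */
	add_partial(n, page, tail);
	stat(s, tail == DEACTIVATE_TO_TAIL ? DEACTIVATE_TO_TAIL
					   : DEACTIVATE_TO_HEAD);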