Message-ID: <50AC4912.7040503@redhat.com>
Date: Tue, 20 Nov 2012 22:22:58 -0500
From: Rik van Riel <riel@...hat.com>
To: habanero@...ux.vnet.ibm.com
CC: Ingo Molnar <mingo@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
David Rientjes <rientjes@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Paul Turner <pjt@...gle.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Christoph Lameter <cl@...ux.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Johannes Weiner <hannes@...xchg.org>,
Hugh Dickins <hughd@...gle.com>
Subject: Re: numa/core regressions fixed - more testers wanted
On 11/20/2012 08:54 PM, Andrew Theurer wrote:
> I can confirm single JVM JBB is working well for me. I see a 30%
> improvement over autoNUMA. What I can't make sense of is some perf
> stats (taken at 80 warehouses on 4 x WST-EX, 512GB memory):
AutoNUMA does not have native THP migration, which may explain some
of the difference.
> tips numa/core:
>
> 5,429,632,865 node-loads
> 3,806,419,082 node-load-misses(70.1%)
> 2,486,756,884 node-stores
> 2,042,557,277 node-store-misses(82.1%)
> 2,878,655,372 node-prefetches
> 2,201,441,900 node-prefetch-misses
>
> autoNUMA:
>
> 4,538,975,144 node-loads
> 2,666,374,830 node-load-misses(58.7%)
> 2,148,950,354 node-stores
> 1,682,942,931 node-store-misses(78.3%)
> 2,191,139,475 node-prefetches
> 1,633,752,109 node-prefetch-misses
>
> The percentage of misses is higher for numa/core. I would have expected
> the performance increase to be due to lower "node-misses", but perhaps I am
> misinterpreting the perf data.
Lack of native THP migration may be enough to explain the
performance difference, despite autonuma having better node
locality.
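
As a cross-check of the percentages quoted above, here is a minimal
sketch that recomputes the miss rates from the raw counters. The
counter values are copied from the quoted perf output; the names
COUNTERS, miss_rate and summarize are only illustrative and are not
part of perf or of either patch set.

```python
# Counter values copied verbatim from the perf output quoted in this thread.
COUNTERS = {
    "numa/core": {
        "node-loads": 5_429_632_865,
        "node-load-misses": 3_806_419_082,
        "node-stores": 2_486_756_884,
        "node-store-misses": 2_042_557_277,
        "node-prefetches": 2_878_655_372,
        "node-prefetch-misses": 2_201_441_900,
    },
    "autoNUMA": {
        "node-loads": 4_538_975_144,
        "node-load-misses": 2_666_374_830,
        "node-stores": 2_148_950_354,
        "node-store-misses": 1_682_942_931,
        "node-prefetches": 2_191_139_475,
        "node-prefetch-misses": 1_633_752_109,
    },
}


def miss_rate(misses, accesses):
    """Return misses as a percentage of accesses."""
    return 100.0 * misses / accesses


def summarize(counters):
    """Map each kernel name to its load/store/prefetch miss percentages."""
    rows = {}
    for kernel, c in counters.items():
        rows[kernel] = {
            "load": miss_rate(c["node-load-misses"], c["node-loads"]),
            "store": miss_rate(c["node-store-misses"], c["node-stores"]),
            "prefetch": miss_rate(c["node-prefetch-misses"], c["node-prefetches"]),
        }
    return rows


if __name__ == "__main__":
    for kernel, rates in summarize(COUNTERS).items():
        line = ", ".join(f"{name} {pct:.1f}%" for name, pct in rates.items())
        print(f"{kernel}: {line}")
```

The load and store rates match the figures quoted above. The prefetch
rates, which were not given, come out at roughly 76.5% for numa/core
and 74.6% for autoNUMA. Note that these are raw event counts, not
normalized per unit of work, so a higher miss percentage does not by
itself say anything about throughput.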
>> Next I'll work on making multi-JVM more of an improvement, and
>> I'll also address any incoming regression reports.
>
> I have issues with multiple KVM VMs running either JBB or
> dbench-in-tmpfs, and I suspect whatever I am seeing is similar to
> whatever multi-JVM on bare metal is seeing. What I typically see is no
> real convergence on a single node for resource usage for any of the VMs.
> For example, when running 8 VMs, 10 vCPUs each, a VM may have the
> following resource usage:
This is an issue. I have tried understanding the new local/shared
and shared task grouping code, but have not wrapped my mind around
that code yet.
I will have to look at that code a few more times, and ask more
questions of Ingo and Peter (and maybe ask some of the same questions
again - I see that some of my comments were addressed in the next
version of the patch, but the email never got a reply).
--
All rights reversed