Message-ID: <20120403203500.GA14386@linux.vnet.ibm.com>
Date: Wed, 4 Apr 2012 02:05:00 +0530
From: Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>
To: Andrea Arcangeli <aarcange@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
Suresh Siddha <suresh.b.siddha@...el.com>,
Mike Galbraith <efault@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Bharata B Rao <bharata.rao@...il.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH 00/39] [RFC] AutoNUMA alpha10
* Andrea Arcangeli <aarcange@...hat.com> [2012-03-26 19:45:47]:

> This is the result of the first round of cleanups of the AutoNUMA patch.

I happened to test numasched and autonuma against a Java benchmark and
here are some results (higher scores are better).

Base       : 1    (std. dev : 91%)
Numa sched : 2.17 (std. dev : 15%)
Autonuma   : 2.56 (std. dev : 10.7%)

Numa sched scores ~2.2x the base case (2.17 vs. 1). Autonuma is ~18% better
than numasched (2.56 / 2.17 ~= 1.18). Note the high standard deviation in the
base case.
Also, given the differences in base kernel versions between the two, this is
admittedly not an apples-to-apples comparison. Getting both patch sets onto a
common code base would allow that kind of comparison!

Details:

Base       = tip (ee415e2) + numasched patches posted on 3/16;
             qemu-kvm 0.12.1
Numa sched = tip (ee415e2) + numasched patches posted on 3/16;
             modified qemu-kvm 1.0.50 that creates memsched groups
Autonuma   = AutoNUMA alpha10 (SHA1 4596315); qemu-kvm 0.12.1

Machine with 2 Quad-core (w/ HT) Intel Nehalem CPUs. Two NUMA nodes each with
8GB memory.
3 VMs are created:

    VM1 and VM2: 4 vcpus, 3GB memory and 1024 cpu.shares each.
    Each runs memory hogs that together consume 2.5GB of memory
    (the 2.5GB is first written to and then continuously read in
    a loop; see the sketch after the VM list).

    VM3: 8 vcpus, 4GB memory and 2048 cpu.shares. Runs the
    SPECjbb2000 benchmark with 8 warehouses (consuming a 2GB heap).

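For reference, here is a minimal sketch of the write-then-read memory hog
described above. It is not the actual hog used in these runs; the per-hog
size, the plain malloc/memset approach and the cache-line stride are all
assumptions made for illustration.

/*
 * Memory-hog sketch: dirty a large buffer once, then keep re-reading
 * it so the pages stay in active use wherever they were placed.
 * The VMs above run enough hogs to total ~2.5GB.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HOG_BYTES	(512UL * 1024 * 1024)	/* 512MB per hog (assumed) */

int main(void)
{
	unsigned char *buf = malloc(HOG_BYTES);
	volatile unsigned long sum = 0;
	unsigned long i;

	if (!buf) {
		perror("malloc");
		return 1;
	}

	/* Write phase: fault in and dirty every page once. */
	memset(buf, 0xab, HOG_BYTES);

	/* Read phase: loop over the buffer forever, one read per cache line. */
	for (;;)
		for (i = 0; i < HOG_BYTES; i += 64)
			sum += buf[i];

	return 0;	/* not reached */
}
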
The benchmark was repeated 5 times. Each run consisted of launching VM1 first,
waiting for it to initialize (wrt memory footprint), launching VM2 next and
waiting for it to initialize, and then launching VM3 and the benchmark inside
VM3. At the end of the benchmark, all VMs are destroyed and the process is
repeated.
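For completeness, a rough sketch of how the per-VM cpu.shares values above
could be applied to each VM's qemu process via the cpu cgroup controller.
The mount point /cgroup/cpu, the group names and the example PID are
assumptions, not details of the actual setup.

/*
 * Create a cpu cgroup for a VM, set its cpu.shares and move the
 * qemu process into it.  Adjust the mount point if the cpu
 * controller is mounted elsewhere (e.g. /sys/fs/cgroup/cpu).
 */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

static int write_str(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return -1;
	}
	fprintf(f, "%s\n", val);
	fclose(f);
	return 0;
}

static int set_vm_shares(const char *group, const char *shares, const char *qemu_pid)
{
	char path[256];

	snprintf(path, sizeof(path), "/cgroup/cpu/%s", group);
	mkdir(path, 0755);			/* ignore EEXIST */

	snprintf(path, sizeof(path), "/cgroup/cpu/%s/cpu.shares", group);
	if (write_str(path, shares))
		return -1;

	snprintf(path, sizeof(path), "/cgroup/cpu/%s/tasks", group);
	return write_str(path, qemu_pid);
}

int main(void)
{
	/* Example matching the description above; "12345" is a made-up PID. */
	return set_vm_shares("vm1", "1024", "12345") ? 1 : 0;
}
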
- vatsa