Message-ID: <4FC94505.3090506@linux.vnet.ibm.com>
Date: Fri, 01 Jun 2012 19:41:09 -0300
From: Mauricio Faria de Oliveira <mauricfo@...ux.vnet.ibm.com>
To: Andrea Arcangeli <aarcange@...hat.com>
CC: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
Suresh Siddha <suresh.b.siddha@...el.com>,
Mike Galbraith <efault@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Bharata B Rao <bharata.rao@...il.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Christoph Lameter <cl@...ux.com>, srikar@...ux.vnet.ibm.com,
mjw@...ux.vnet.ibm.com
Subject: Re: [PATCH 00/35] AutoNUMA alpha14
Hi Andrea, everyone,
AA> Changelog from alpha13 to alpha14:
AA> [...]
AA> o autonuma_balance only runs along with run_rebalance_domains, to
AA> avoid altering the scheduler runtime. [...]
AA> [...] This change has not
AA> yet been tested on specjbb or more schedule intensive benchmarks,
AA> but I don't expect measurable NUMA affinity regressions. [...]
Perhaps I can contribute a bit to the SPECjbb tests.

I got SPECjbb2005 results for 3.4-rc2 mainline, numasched,
autonuma-alpha10, and autonuma-alpha13. If you judge the data to be OK,
it may serve as a baseline for a comparison between autonuma-alpha13 and
alpha14, to verify NUMA affinity regressions.
The system is an Intel 2-socket Blade. Each NUMA node has 6 cores (+6
hyperthreads) and 12 GB RAM. Different permutations of THP, KSM, and VM
memory size were tested for each kernel.
I'll have to leave the analysis of each variable to you, as I'm not
familiar w/ the code and its expected impacts; but I'm perfectly fine with
providing more details about the tests, environment, and procedures, and
even doing some reruns, if needed.
Please CC me on questions and comments.
Environment:
------------

Host:
 - Enterprise Linux Distro
 - Kernel: 3.4-rc2 (either mainline, or patched w/ numasched,
   autonuma-alpha10, or autonuma-alpha13)
 - 2 NUMA nodes. 6 cores + 6 hyperthreads/node, 12 GB RAM/node.
   (total of 24 logical CPUs and 24 GB RAM)
 - Hypervisor: qemu-kvm 1.0.50 (+ memsched patches only for numasched)
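
For completeness, the node topology above can be read back from sysfs;
the snippet below is only an illustration of that, not part of the
benchmark procedure itself:

  #!/usr/bin/env python
  # Sanity check of the host topology described above: 2 nodes,
  # 12 logical CPUs and 12 GB RAM per node. Illustration only.
  import glob

  for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
      with open(node + "/cpulist") as f:
          cpus = f.read().strip()
      with open(node + "/meminfo") as f:
          # First line: "Node N MemTotal:  <kB> kB"
          mem_kb = int(f.readline().split()[3])
      print("%s: cpus %s, %.1f GB" % (node.split("/")[-1], cpus,
                                      mem_kb / (1024.0 * 1024.0)))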
VMs:
 - Enterprise Linux Distro
 - Distro Kernel

 1 Main VM (VM1) -- relevant benchmark score.
   - 12 vCPUs
   - 12 GB (for '< 1 Node' configuration) or 14 GB (for '> 1 Node'
     configuration)

 2 Noise VMs (VM2 and VM3)
   - each noise VM has half of the remaining resources.
   - 6 vCPUs
   - 4 GB (for '< 1 Node' configuration) or 3 GB ('> 1 Node' configuration)
     (to sum 20 GB w/ main VM + 4 GB for host = total 24 GB)
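
The VM sizing above (here the '< 1 Node' configuration) could be started
along these lines; the VM names, image paths, and the exact qemu-kvm
invocation are my assumptions for illustration, not the precise commands
used in the tests:

  #!/usr/bin/env python
  # Hypothetical sketch: start VM1/VM2/VM3 with the '< 1 Node' sizing
  # (12 vCPUs / 12 GB for the Main VM, 6 vCPUs / 4 GB for each Noise VM).
  # Image paths and flags are assumptions, not the actual invocation.
  import subprocess

  VMS = [
      # (name, vcpus, mem_mb, disk image -- hypothetical paths)
      ("vm1", 12, 12288, "/var/lib/images/vm1.img"),
      ("vm2",  6,  4096, "/var/lib/images/vm2.img"),
      ("vm3",  6,  4096, "/var/lib/images/vm3.img"),
  ]

  for name, vcpus, mem_mb, image in VMS:
      subprocess.check_call([
          "qemu-kvm", "-enable-kvm",
          "-name", name,
          "-smp", str(vcpus),
          "-m", str(mem_mb),
          "-drive", "file=%s,if=virtio" % image,
          "-daemonize",
      ])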
Settings:
 - Swapping disabled on host and VMs.
 - Memory Overcommit enabled on host and VMs.
 - THP on host is a variable. THP disabled on VMs.
 - KSM on host is a variable. KSM disabled on VMs.
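
A minimal sketch of how the host-side knobs above (swap, overcommit,
THP, KSM) can be toggled between permutations; the sysfs/procfs paths
are the standard ones, but the script itself is just an illustration,
not the exact procedure used:

  #!/usr/bin/env python
  # Illustration only: toggle the host knobs used as variables above.
  # Must run as root; paths are the standard Linux sysfs/procfs locations.
  import subprocess

  def write(path, value):
      with open(path, "w") as f:
          f.write(value)

  def set_host_knobs(thp_enabled, ksm_enabled):
      # Swapping disabled on the host for all runs.
      subprocess.check_call(["swapoff", "-a"])
      # Memory overcommit enabled ("1" = always overcommit).
      write("/proc/sys/vm/overcommit_memory", "1")
      # THP is a test variable: "always" or "never".
      write("/sys/kernel/mm/transparent_hugepage/enabled",
            "always" if thp_enabled else "never")
      # KSM is a test variable: 1 = run, 0 = stop.
      write("/sys/kernel/mm/ksm/run", "1" if ksm_enabled else "0")

  # Example: the "No THP, KSM enabled" permutation highlighted below.
  set_host_knobs(thp_enabled=False, ksm_enabled=True)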
Results
=======

The reference is the mainline kernel with THP disabled; its score is
taken as approximately 100%. It performed similarly (less than 2%
difference) across the 4 permutations of KSM and Main VM memory size.

For the results of all permutations, see chart [1].

One interesting permutation seems to be: THP disabled, KSM enabled.
Interpretation:
- higher is better;
- main VM should perform better than noise VMs;
- noise VMs should perform similarly.
Main VM < 1 Node
----------------
                Main VM    Noise VM   Noise VM
mainline        ~100%      60%        60%
numasched *     50%/135%   30%/58%    40%/68%
autonuma-a10    125%       60%        60%
autonuma-a13    126%       32%        32%

* numasched yielded a wide range of scores. Is this behavior expected?
Main VM > 1 Node
----------------
                Main VM    Noise VM   Noise VM
mainline        ~100%      60%        59%
numasched       60%        48%        48%
autonuma-a10    62%        37%        38%
autonuma-a13    125%       61%        63%
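
For clarity, here is a trivial sketch of the normalization behind the
percentages, assuming (as the tables suggest) a single reference score --
the Main VM on mainline with THP disabled. The bops numbers below are
placeholders, not actual results:

  #!/usr/bin/env python
  # Illustration of how the table percentages are derived: raw SPECjbb2005
  # scores (bops) divided by the reference run's score. The numbers here
  # are placeholders, not the actual benchmark output.

  REFERENCE_BOPS = 100000.0   # hypothetical Main VM score, mainline / no THP

  def to_percent(bops, reference=REFERENCE_BOPS):
      """Normalize a raw SPECjbb2005 score against the reference run."""
      return 100.0 * bops / reference

  # e.g. numbers resembling the autonuma-a13 '< 1 Node' row above:
  for vm, bops in (("Main VM", 126000.0), ("Noise VM", 32000.0),
                   ("Noise VM", 32000.0)):
      print("%-9s %3.0f%%" % (vm, to_percent(bops)))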
Considerations:
---------------

The 3 VMs ran SPECjbb2005, starting the benchmark synchronously.

For the benchmark run to take about the same time on the 3 VMs, its
configuration for the Noise VMs is different from the Main VM's.
So comparing VM1 scores w/ VM2 or VM3 scores is not reasonable.
But comparing scores between VM2 and VM3 is perfectly fine (it's
evidence of the balancing that was performed).
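
The synchronized start is essentially "kick off the benchmark on all
three VMs at once and wait"; below is a sketch of one way to do that
over ssh. The hostnames and the SPECjbb invocation are hypothetical,
not the exact setup used:

  #!/usr/bin/env python
  # Hypothetical sketch of the synchronized start: launch SPECjbb2005 on
  # the three VMs at (nearly) the same time and wait for all of them.
  # Hostnames and the benchmark command/path are assumptions.
  import subprocess
  import threading

  VMS = ["vm1", "vm2", "vm3"]
  SPECJBB_CMD = "cd /opt/SPECjbb2005 && ./run.sh"   # hypothetical path

  def run_benchmark(host):
      subprocess.check_call(["ssh", host, SPECJBB_CMD])

  threads = [threading.Thread(target=run_benchmark, args=(h,)) for h in VMS]
  for t in threads:
      t.start()   # start all three as close together as possible
  for t in threads:
      t.join()    # wait until every VM has finished its run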
Sometimes both autonuma and numasched prioritized one of the Noise VMs
over the other Noise VM, or even over the Main VM. In these cases, some
reruns would yield scores of the 'expected proportion' given the VMs'
configuration (Main VM w/ the highest score, both Noise VMs with lower,
roughly equal scores).

The non-expected-proportion scores happened less often w/
autonuma-alpha13, followed by autonuma-alpha10, and finally numasched
(i.e., numasched had the highest rate of non-expected-proportion scores).
For most permutations, numasched didn't yield scores of the expected
proportion. I'd like to know how likely this is to happen before
performing additional runs to confirm it. Could anyone provide
evidence or thoughts?
Links:
------
[1] http://dl.dropbox.com/u/82832537/kvm-numa-comparison-0.png
--
Mauricio Faria de Oliveira
IBM Linux Technology Center