Message-ID: <20120621145552.GG4954@redhat.com>
Date: Thu, 21 Jun 2012 16:55:52 +0200
From: Andrea Arcangeli <aarcange@...hat.com>
To: Alex Shi <lkml.alex@...il.com>
Cc: Petr Holasek <pholasek@...hat.com>,
"Kirill A. Shutemov" <kirill@...temov.name>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Hillf Danton <dhillf@...il.com>, Dan Smith <danms@...ibm.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>, Paul Turner <pjt@...gle.com>,
Suresh Siddha <suresh.b.siddha@...el.com>,
Mike Galbraith <efault@....de>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Lai Jiangshan <laijs@...fujitsu.com>,
Bharata B Rao <bharata.rao@...il.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Christoph Lameter <cl@...ux.com>,
Alex Shi <alex.shi@...el.com>,
"Chen, Tim C" <tim.c.chen@...el.com>
Subject: Re: AutoNUMA15
On Thu, Jun 21, 2012 at 03:29:52PM +0800, Alex Shi wrote:
> > I released an AutoNUMA15 branch that includes all pending fixes:
> >
> > git clone --reference linux -b autonuma15 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> >
>
> I did a quick testing on our
> specjbb2005/oltp/hackbench/tbench/netperf-loop/fio/ffsb on NHM EP/EX,
> Core2 EP, Romely EP machine, In generally no clear performance change
> found. Is this results expected for this patch set?
hackbench and the network benchmarks won't benefit. The former
overschedules like crazy, so there's no way any AutoNUMA balancing can
have an effect with that much overscheduling and a zillion threads. The
latter are I/O dominated and usually take so little RAM that it doesn't
matter; the memory accesses on the kernel side and the DMA should
dominate their CPU utilization. The same applies to filesystem
benchmarks like fio.
On all kernel benchmarks dominated by _system_ time, no performance
improvement is expected; if you don't measure a regression, that's
more than enough.
The only benchmarks that benefit are userland ones, where the user/nice
time in top dominates. AutoNUMA cannot optimize or move kernel memory
around; it only optimizes userland computations.
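One rough way to tell which side of this split a benchmark falls on is to sample the aggregate "cpu" line of /proc/stat before and after the run and compare the user+nice ticks with the system ticks. A minimal sketch in Python, assuming the standard /proc/stat field order (user, nice, system, idle, ...); the helper names are made up for illustration and irq/softirq time is ignored for simplicity:

```python
def parse_cpu_line(stat_text):
    """Return (user, nice, system) ticks from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-CPU lines are cpu0, cpu1, ...
        if fields and fields[0] == "cpu":
            user, nice, system = (int(v) for v in fields[1:4])
            return user, nice, system
    raise ValueError("no aggregate 'cpu' line found")


def userland_share(before_text, after_text):
    """Fraction of busy (user + nice + system) ticks spent in userland."""
    u0, n0, s0 = parse_cpu_line(before_text)
    u1, n1, s1 = parse_cpu_line(after_text)
    userland = (u1 - u0) + (n1 - n0)
    system = s1 - s0
    busy = userland + system
    return userland / busy if busy else 0.0


def read_proc_stat(path="/proc/stat"):
    """Read the current contents of /proc/stat (Linux only)."""
    with open(path) as f:
        return f.read()
```

Call read_proc_stat() once before and once after the benchmark and pass both snapshots to userland_share(). A value close to 1.0 suggests a userland-dominated workload of the kind described above; a low value suggests a system-time-dominated one.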
So you should run HPC jobs. The only strange thing here is that
specjbb2005 gets a measurable, significant boost with AutoNUMA, so if
you didn't get a boost even with that, you may want to verify that
this prints 1:
cat /sys/kernel/mm/autonuma/enabled
Also verify:
CONFIG_AUTONUMA_DEFAULT_ENABLED=y
If enabled already reads 1, then maybe the memory interconnect is so
fast that there's no benefit?
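The two checks above can be scripted. A minimal sketch, assuming the sysfs file contains just the digit and that the kernel config is available as plain text (for example a /boot/config-<release> file, which is a distro convention and not something this thread specifies); the function names are hypothetical:

```python
def autonuma_runtime_enabled(sysfs_text):
    """True if the contents of /sys/kernel/mm/autonuma/enabled read as 1."""
    return sysfs_text.strip() == "1"


def config_option_enabled(config_text, option="CONFIG_AUTONUMA_DEFAULT_ENABLED"):
    """True if the kernel config text contains the line '<option>=y'."""
    wanted = option + "=y"
    return any(line.strip() == wanted for line in config_text.splitlines())


def read_text(path):
    """Read a file, returning None if it is absent (e.g. a kernel without AutoNUMA)."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None


if __name__ == "__main__":
    text = read_text("/sys/kernel/mm/autonuma/enabled")
    if text is None:
        print("autonuma sysfs entry not found")
    else:
        print("enabled at runtime:", autonuma_runtime_enabled(text))
```

A commented-out config line such as "# CONFIG_AUTONUMA_DEFAULT_ENABLED is not set" is deliberately not treated as enabled.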
My numa01/02 benchmarks measure the best and worst case of the hardware
(not software) with the -DINVERSE_BIND and -DHARD_BIND parameters; you
can consider running them to verify.
Probably there should be a small boot-time kernel benchmark that
measures inverse bind vs hard bind performance across the first two
nodes. If the difference is nil, AutoNUMA should disengage and not even
allocate the page_autonuma (now only 12 bytes per page, but anyway).
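To make the idea concrete, here is a purely illustrative sketch in Python of the disengage decision and of the page_autonuma cost. Nothing like this exists in the patch set as far as this thread says: the function names, the 2% tolerance, and the 4 KiB page size are all assumptions; only the 12 bytes per page figure comes from the text above.

```python
def should_disengage(hard_bind_secs, inverse_bind_secs, tolerance=0.02):
    """True if the inverse-bind run is not slower than the hard-bind run
    by more than `tolerance` (an arbitrary placeholder threshold)."""
    if hard_bind_secs <= 0:
        raise ValueError("hard_bind_secs must be positive")
    slowdown = (inverse_bind_secs - hard_bind_secs) / hard_bind_secs
    return slowdown <= tolerance


def page_autonuma_bytes(ram_bytes, page_size=4096, per_page=12):
    """Memory taken by page_autonuma at `per_page` bytes per page.
    page_size=4096 is an assumption (the common x86 base page size)."""
    return (ram_bytes // page_size) * per_page
```

Under those assumptions, a machine with 64 GiB of RAM would spend 192 MiB on page_autonuma (12/4096, or roughly 0.3% of RAM), which is the allocation that disengaging would avoid.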
If you can retest with autonuma17 it would help too, since some
performance issue was fixed there and it would stress the new autonuma
migration LRU code:
git clone --reference linux -b autonuma17 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git autonuma17
And the very latest is always at the autonuma branch:
git clone --reference linux -b autonuma git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git autonuma
Thanks,
Andrea