Message-Id: <1376663644-153546-1-git-send-email-athorlton@sgi.com>
Date: Fri, 16 Aug 2013 09:33:56 -0500
From: Alex Thorlton <athorlton@....com>
To: linux-kernel@...r.kernel.org
Cc: Alex Thorlton <athorlton@....com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Rik van Riel <riel@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
"Eric W . Biederman" <ebiederm@...ssion.com>,
Sedat Dilek <sedat.dilek@...il.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Dave Jones <davej@...hat.com>,
Michael Kerrisk <mtk.manpages@...il.com>,
"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
David Howells <dhowells@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Al Viro <viro@...iv.linux.org.uk>,
Oleg Nesterov <oleg@...hat.com>,
Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
Kees Cook <keescook@...omium.org>, Robin Holt <holt@....com>
Subject: [PATCH 0/8] Re: [PATCH] Add per-process flag to control thp
Here are the results from one of the benchmarks (321.equake_l) that
performs particularly poorly when THP is enabled. Unfortunately, the
vclear patches don't seem to provide a performance boost. I've attached
the patches, including the changes I had to make to get them to apply
to the latest kernel.
This first set of tests was run on the latest community kernel, with the
vclear patches:
Kernel string: Kernel 3.11.0-rc5-medusa-00021-g1a15a96-dirty
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# time ./run.sh
...
Done. Terminating the simulation.
real 25m34.052s
user 10769m7.948s
sys 37m46.524s
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# echo never > /sys/kernel/mm/transparent_hugepage/enabled
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# time ./run.sh
...
Done. Terminating the simulation.
real 5m0.377s
user 2202m0.684s
sys 108m31.816s
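For anyone scripting these runs: the sysfs file marks the active THP mode with square brackets, as in the `cat` output above. Here is a minimal sketch of pulling that mode out of the line. The helper name `active_thp_mode` is made up for illustration, and it works on a string rather than reading /sys directly so it can run on any machine.

```python
import re


def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (currently active) token from the sysfs line."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no active mode found in {sysfs_text!r}")
    return match.group(1)


# The two states seen in the transcript above.
print(active_thp_mode("[always] madvise never"))
print(active_thp_mode("always madvise [never]"))
```

On a real system the input would presumably come from reading /sys/kernel/mm/transparent_hugepage/enabled.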
Here are the same tests on the clean kernel:
Kernel string: Kernel 3.11.0-rc5-medusa-00013-g584d88b
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> time ./run.sh
...
Done. Terminating the simulation.
real 21m44.052s
user 10809m55.356s
sys 39m58.300s
harp31-sys:~ # echo never > /sys/kernel/mm/transparent_hugepage/enabled
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> time ./run.sh
...
Done. Terminating the simulation.
real 4m52.502s
user 2127m18.548s
sys 104m50.828s
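To put the four runs side by side, here is a rough sketch that converts the `real` values reported by `time` above into seconds and computes the always/never ratio per kernel. The names `to_seconds`, `slowdown` and `REAL_TIMES` are hypothetical helpers for this illustration; only the timing strings come from the runs above.

```python
import re

# Wall-clock ("real") times copied from the runs above.
REAL_TIMES = {
    "vclear": {"always": "25m34.052s", "never": "5m0.377s"},
    "clean": {"always": "21m44.052s", "never": "4m52.502s"},
}


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '25m34.052s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def slowdown(kernel: str) -> float:
    """Ratio of wall-clock time with THP 'always' to THP 'never'."""
    times = REAL_TIMES[kernel]
    return to_seconds(times["always"]) / to_seconds(times["never"])


for kernel in REAL_TIMES:
    print(f"{kernel}: THP always takes {slowdown(kernel):.2f}x as long as never")
```

By this measure the benchmark takes roughly 5.1 times as long with THP enabled on the vclear kernel and roughly 4.5 times as long on the clean kernel. That is consistent with the vclear patches not helping here.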
I'm now working on gathering more information about the root cause of
the performance issues...
Alex Thorlton (8):
  THP: Use real address for NUMA policy
  mm: make clear_huge_page tolerate non aligned address
  THP: Pass real, not rounded, address to clear_huge_page
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2
  remove KM_USER0 from kmap_atomic call
  fix up references to kernel_fpu_begin/end
 arch/x86/include/asm/page.h          |  2 +
 arch/x86/include/asm/string_32.h     |  5 ++
 arch/x86/include/asm/string_64.h     |  5 ++
 arch/x86/lib/Makefile                |  1 +
 arch/x86/lib/clear_page_nocache_32.S | 30 ++++++++++++
 arch/x86/lib/clear_page_nocache_64.S | 92 ++++++++++++++++++++++++++++++++++++
 arch/x86/mm/fault.c                  |  7 +++
 mm/huge_memory.c                     | 17 +++----
 mm/memory.c                          | 31 ++++++++++--
 9 files changed, 179 insertions(+), 11 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_nocache_32.S
 create mode 100644 arch/x86/lib/clear_page_nocache_64.S
--
1.7.12.4