lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 16 Aug 2013 09:33:56 -0500
From:	Alex Thorlton <athorlton@....com>
To:	linux-kernel@...r.kernel.org
Cc:	Alex Thorlton <athorlton@....com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
	Rik van Riel <riel@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	"Eric W . Biederman" <ebiederm@...ssion.com>,
	Sedat Dilek <sedat.dilek@...il.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Dave Jones <davej@...hat.com>,
	Michael Kerrisk <mtk.manpages@...il.com>,
	"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
	David Howells <dhowells@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Al Viro <viro@...iv.linux.org.uk>,
	Oleg Nesterov <oleg@...hat.com>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Kees Cook <keescook@...omium.org>, Robin Holt <holt@....com>
Subject: [PATCH 0/8] Re: [PATCH] Add per-process flag to control thp

Here are the results from one of the benchmarks that performs
particularly poorly when thp is enabled.  Unfortunately the vclear
patches don't seem to provide a performance boost.  I've attached
the patches that include the changes I had to make to get the vclear
patches applied to the latest kernel.

This first set of tests was run on the latest community kernel, with the
vclear patches:

Kernel string: Kernel 3.11.0-rc5-medusa-00021-g1a15a96-dirty
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# time ./run.sh
...
Done. Terminating the simulation.

real    25m34.052s
user    10769m7.948s
sys     37m46.524s

harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# echo never > /sys/kernel/mm/transparent_hugepage/enabled
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
harp31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l# time ./run.sh
...
Done. Terminating the simulation.

real    5m0.377s
user    2202m0.684s
sys     108m31.816s

Here are the same tests on the clean kernel:

Kernel string: Kernel 3.11.0-rc5-medusa-00013-g584d88b

Kernel string: Kernel 3.11.0-rc5-medusa-00013-g584d88b
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> time ./run.sh
...
Done. Terminating the simulation.

real    21m44.052s
user    10809m55.356s
sys     39m58.300s


harp31-sys:~ # echo never > /sys/kernel/mm/transparent_hugepage/enabled
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
athorlton@...p31-sys:/home/estes04/athorlton/Testing/progs/thp_benchmarks/321.equake_l> time ./run.sh
...
Done. Terminating the simulation.

real    4m52.502s
user    2127m18.548s
sys     104m50.828s

Working on getting some more information about the root of the
performance issues now...

Alex Thorlton (8):
  THP: Use real address for NUMA policy
  mm: make clear_huge_page tolerate non aligned address
  THP: Pass real, not rounded, address to clear_huge_page
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2
  remove KM_USER0 from kmap_atomic call
  fix up references to kernel_fpu_begin/end

 arch/x86/include/asm/page.h          |  2 +
 arch/x86/include/asm/string_32.h     |  5 ++
 arch/x86/include/asm/string_64.h     |  5 ++
 arch/x86/lib/Makefile                |  1 +
 arch/x86/lib/clear_page_nocache_32.S | 30 ++++++++++++
 arch/x86/lib/clear_page_nocache_64.S | 92 ++++++++++++++++++++++++++++++++++++
 arch/x86/mm/fault.c                  |  7 +++
 mm/huge_memory.c                     | 17 +++----
 mm/memory.c                          | 31 ++++++++++--
 9 files changed, 179 insertions(+), 11 deletions(-)
 create mode 100644 arch/x86/lib/clear_page_nocache_32.S
 create mode 100644 arch/x86/lib/clear_page_nocache_64.S

-- 
1.7.12.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ