Date:   Tue, 23 Aug 2016 19:25:33 +0300
From:   Alexey Dobriyan <adobriyan@...il.com>
To:     x86@...nel.org, linux-kernel@...r.kernel.org
Subject: [PATCH 2/3] x86: support REP MOVSB copy_page()

A microbenchmark shows that the "REP MOVSB" copy_page() is faster
than the "REP MOVSQ" version on an Intel i5-something Haswell CPU
that supports both REP_GOOD and ERMS.

N=1<<27
rep movsq:	6.758841901 ± 0.04%
rep movsb:	6.253927309 ± 0.02%
-----------------------------------
			-7.5%

Signed-off-by: Alexey Dobriyan <adobriyan@...il.com>
---
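For anyone who wants to try a similar comparison: the message above does
not say how the numbers were measured, so the following is only a rough
user-space sketch, not the benchmark used for this patch. It assumes
x86-64 and a GCC/Clang-style compiler with extended inline asm. The names
copy_page_movsq(), copy_page_movsb(), copy_is_correct() and time_copies()
are made up for illustration and do not exist in the kernel.

```c
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime() under strict -std= */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define PAGE_SIZE 4096

typedef void (*copy_page_fn)(void *dst, const void *src);

/* Hypothetical user-space analogue: 4096/8 quadword moves. */
static void copy_page_movsq(void *dst, const void *src)
{
	size_t n = PAGE_SIZE / 8;

	__asm__ volatile("rep movsq"
			 : "+D"(dst), "+S"(src), "+c"(n)
			 :
			 : "memory");
}

/* Hypothetical user-space analogue: 4096 byte moves. */
static void copy_page_movsb(void *dst, const void *src)
{
	size_t n = PAGE_SIZE;

	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(n)
			 :
			 : "memory");
}

/* Returns 1 if fn copies one full page byte-for-byte. */
static int copy_is_correct(copy_page_fn fn)
{
	static unsigned char src[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
	static unsigned char dst[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
	size_t i;

	for (i = 0; i < PAGE_SIZE; i++)
		src[i] = (unsigned char)(i * 31 + 7);
	memset(dst, 0, PAGE_SIZE);
	fn(dst, src);
	return memcmp(dst, src, PAGE_SIZE) == 0;
}

/* Wall-clock seconds spent on iters back-to-back page copies. */
static double time_copies(copy_page_fn fn, unsigned long iters)
{
	static unsigned char src[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
	static unsigned char dst[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
	struct timespec t0, t1;
	unsigned long i;

	assert(copy_is_correct(fn));
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		fn(dst, src);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling time_copies(copy_page_movsq, 1UL << 27) and
time_copies(copy_page_movsb, 1UL << 27) from a small main() would give
two timings to compare, assuming N above is the number of page copies,
which the message does not state. This copies the same cache-hot page
over and over, so results may differ from in-kernel behaviour.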

 arch/x86/lib/copy_page_64.S |   11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/arch/x86/lib/copy_page_64.S
+++ b/arch/x86/lib/copy_page_64.S
@@ -12,12 +12,21 @@
  */
 	ALIGN
 ENTRY(copy_page)
-	ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
+	ALTERNATIVE_2 "jmp copy_page_regs",	\
+		"", X86_FEATURE_REP_GOOD,	\
+		"jmp copy_page_rep_movsb", X86_FEATURE_ERMS
+
 	movl	$4096/8, %ecx
 	rep	movsq
 	ret
 ENDPROC(copy_page)
 
+ENTRY(copy_page_rep_movsb)
+	mov	$4096, %ecx
+	rep movsb
+	ret
+ENDPROC(copy_page_rep_movsb)
+
 ENTRY(copy_page_regs)
 	subq	$2*8,	%rsp
 	movq	%rbx,	(%rsp)
