lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <tip-236222d39347e0e486010f10c1493e83dbbdfba8@git.kernel.org>
Date:   Fri, 30 Jun 2017 06:10:59 -0700
From:   tip-bot for Paolo Abeni <tipbot@...or.com>
To:     linux-tip-commits@...r.kernel.org
Cc:     bp@...en8.de, hpa@...or.com, pabeni@...hat.com, mingo@...nel.org,
        luto@...nel.org, brgerst@...il.com, dvlasenk@...hat.com,
        keescook@...omium.org, tglx@...utronix.de,
        hannes@...essinduktion.org, peterz@...radead.org,
        jpoimboe@...hat.com, linux-kernel@...r.kernel.org,
        gnomes@...rguk.ukuu.org.uk, torvalds@...ux-foundation.org
Subject: [tip:x86/asm] x86/uaccess: Optimize
 copy_user_enhanced_fast_string() for short strings

Commit-ID:  236222d39347e0e486010f10c1493e83dbbdfba8
Gitweb:     http://git.kernel.org/tip/236222d39347e0e486010f10c1493e83dbbdfba8
Author:     Paolo Abeni <pabeni@...hat.com>
AuthorDate: Thu, 29 Jun 2017 15:55:58 +0200
Committer:  Ingo Molnar <mingo@...nel.org>
CommitDate: Fri, 30 Jun 2017 09:52:51 +0200

x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings

According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.

This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.

The 64 bytes cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.

Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(shorter the string, larger the gain):

 - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
 - in the [72%-9%] range on Intel Core i7-4810MQ

Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.

Signed-off-by: Paolo Abeni <pabeni@...hat.com>
Acked-by: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Alan Cox <gnomes@...rguk.ukuu.org.uk>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Brian Gerst <brgerst@...il.com>
Cc: Denys Vlasenko <dvlasenk@...hat.com>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Hannes Frederic Sowa <hannes@...essinduktion.org>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Kees Cook <keescook@...omium.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
 arch/x86/lib/copy_user_64.S | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index c595957..020f75c 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -37,7 +37,7 @@ ENTRY(copy_user_generic_unrolled)
 	movl %edx,%ecx
 	andl $63,%edx
 	shrl $6,%ecx
-	jz 17f
+	jz .L_copy_short_string
 1:	movq (%rsi),%r8
 2:	movq 1*8(%rsi),%r9
 3:	movq 2*8(%rsi),%r10
@@ -58,7 +58,8 @@ ENTRY(copy_user_generic_unrolled)
 	leaq 64(%rdi),%rdi
 	decl %ecx
 	jnz 1b
-17:	movl %edx,%ecx
+.L_copy_short_string:
+	movl %edx,%ecx
 	andl $7,%edx
 	shrl $3,%ecx
 	jz 20f
@@ -174,6 +175,8 @@ EXPORT_SYMBOL(copy_user_generic_string)
  */
 ENTRY(copy_user_enhanced_fast_string)
 	ASM_STAC
+	cmpl $64,%edx
+	jb .L_copy_short_string	/* less then 64 bytes, avoid the costly 'rep' */
 	movl %edx,%ecx
 1:	rep
 	movsb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ