lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 8 Feb 2023 22:21:12 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andrew.Cooper3@...rix.com
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, mhiramat@...nel.org,
        kirill.shutemov@...ux.intel.com, jpoimboe@...hat.com
Subject: Re: [PATCH v3 3/4] x86/alternative: Rewrite optimize_nops() some

On Wed, Feb 08, 2023 at 10:08:12PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 08, 2023 at 09:44:04PM +0100, Peter Zijlstra wrote:
> 
> > [   11.584069] SMP alternatives: ffffffff82000095: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.590068] SMP alternatives: ffffffff820001f3: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.720069] SMP alternatives: ffffffff8200189f: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.731069] SMP alternatives: ffffffff820019ae: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.738069] SMP alternatives: ffffffff82001a4a: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.746069] SMP alternatives: ffffffff82001b2d: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.766069] SMP alternatives: ffffffff82001d14: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.770069] SMP alternatives: ffffffff82001dd5: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> > [   11.779069] SMP alternatives: ffffffff82001f35: [0:20) optimized NOPs: eb 12 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
> 
> UNTRAIN_RET -- specifically RESET_CALL_DEPTH

19:       48 c7 c0 80 00 00 00    mov    $0x80,%rax
20:       48 c1 e0 38             shl    $0x38,%rax
24:       65 48 89 04 25 00 00 00 00      mov    %rax,%gs:0x0     29: R_X86_64_32S        pcpu_hot+0x10

Is ofc an atrocity.

We can easily trim that by 5 bytes to:

0:   b0 80                   mov    $0x80,%al
2:   48 c1 e0 38             shl    $0x38,%rax
6:   65 48 89 04 25 00 00 00 00      mov    %rax,%gs:0x0

Who cares about the top bytes, we're explicitly shifting them out
anyway. But that's still 15 bytes or so.

If it weren't for those pesky prefix penalties that would make exactly
one instruction :-)


diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index e04313e89f4f..be792f9407b5 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -84,7 +84,7 @@
 	movq	$-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
 
 #define RESET_CALL_DEPTH					\
-	mov	$0x80, %rax;					\
+	movb	$0x80, %al;					\
 	shl	$56, %rax;					\
 	movq	%rax, PER_CPU_VAR(pcpu_hot + X86_call_depth);
 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ