lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 17 May 2016 09:31:12 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Josh Poimboeuf <jpoimboe@...hat.com>
Cc:	Ingo Molnar <mingo@...nel.org>,
	Matt Fleming <matt@...eblueprint.co.uk>,
	Alex Thorlton <athorlton@....com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Borislav Petkov <bp@...en8.de>,
	Andy Lutomirski <luto@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Denys Vlasenko <dvlasenk@...hat.com>
Subject: Re: [PATCH] x86/asm/entry: fix stack return address retrieval in thunk

On Tue, May 17, 2016 at 7:43 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
>
> index 98df1fa..dae7ca0 100644
> --- a/arch/x86/entry/thunk_64.S
> +++ b/arch/x86/entry/thunk_64.S
> @@ -15,9 +15,10 @@
>         .globl \name
>         .type \name, @function
>  \name:
> +       /* push 1 register if frame pointers are enabled */
>         FRAME_BEGIN
>
> -       /* this one pushes 9 elems, the next one would be %rIP */
> +       /* push 9 registers */

I don't hate this patch, but quite frankly, as with the other case,
I'd just make the frame pointer be unconditional in this case.

If we push nine other registers, the frame pointer setup code is *not*
going to matter.

The reason to avoid frame pointers in code generation is two-fold:

 1) for small leaf functions, it often ends up dominating

 2) it removes a register that is otherwise usable, which can be
particularly bad on 32-bit x86 due to the much more limited number of
registers (and was apparently really noticeable on the older on-order
atom cores)

and in this case neither of them is really an issue.

So I would suggest that any case that actually depends on a frame
access just make the frame pointer not just unconditional, but
_explicit_.

So not just avoiding the macro because it's conditional, but write out
the sequence to actually set up the frame, and then use

-       movq 9*8(%rsp), %rdi
+       movq 8(%rbp), %rdi   # return address

to entirely avoid all kind of "how many registers have we pushed" math.

Considering that we got this wrong in two places, it's clearly too
subtle for our little brains as-is.

               Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ