Date:   Fri, 11 Aug 2017 11:57:13 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>, Brian Gerst <brgerst@...il.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Arnd Bergmann <arnd@...db.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Borislav Petkov <bp@...en8.de>,
        "linux-tip-commits@...r.kernel.org" 
        <linux-tip-commits@...r.kernel.org>
Subject: Re: [tip:x86/asm] objtool: Track DRAP separately from callee-saved
 registers

On Fri, Aug 11, 2017 at 09:22:11AM -0700, Andy Lutomirski wrote:
> On Fri, Aug 11, 2017 at 5:13 AM, tip-bot for Josh Poimboeuf
> <tipbot@...or.com> wrote:
> > Commit-ID:  bf4d1a83758368c842c94cab9661a75ca98bc848
> > Gitweb:     http://git.kernel.org/tip/bf4d1a83758368c842c94cab9661a75ca98bc848
> > Author:     Josh Poimboeuf <jpoimboe@...hat.com>
> > AuthorDate: Thu, 10 Aug 2017 16:37:26 -0500
> > Committer:  Ingo Molnar <mingo@...nel.org>
> > CommitDate: Fri, 11 Aug 2017 14:06:15 +0200
> >
> > objtool: Track DRAP separately from callee-saved registers
> >
> > When GCC realigns a function's stack, it sometimes uses %r13 as the DRAP
> > register, like:
> >
> >   push  %r13
> >   lea   0x10(%rsp), %r13
> >   and   $0xfffffffffffffff0, %rsp
> >   pushq -0x8(%r13)
> >   push  %rbp
> >   mov   %rsp, %rbp
> >   push  %r13
> >   ...
> >   mov   -0x8(%rbp),%r13
> >   leaveq
> >   lea   -0x10(%r13), %rsp
> >   pop   %r13
> >   retq
> >
> 
> I have a couple questions, mainly to help me understand.
> 
> Question 1: What does DRAP stand for?  Duplicate Return Address
> Pointer?  Dynamic ReAlignment Pointer?  I tried searching and got
> nothing.

It seems to be a GCC invention which stands for:

  Dynamic Realign Argument Pointer.

I don't think it's documented anywhere, but there are at least some
comments about it in the GCC sources if you search for "DRAP".

> Question 2: What's up with the resulting stack layout?  It seems we have:
> 
> caller's last stack slot  <-- r13 in function body points here
> return address
> old r13
> [possible padding for alignment]
> return address, duplicated (for naive unwinder's benefit?)
> old rbp  <-- rbp in body points here
> new r13, i.e. pointer to caller's last stack slot
> 
> Now we have the function body, and r13 is free for use in here because
> it's saved.
> 
> In the epilogue, we recover r13, use leaveq (hmm, shorter than pop
> %rbp but does more work than needed), restore the old r13, and return.
> 
> I don't get it, though.  gcc only ever uses that inner r13 with an
> offset.  The code would be considerably shorter if the second
> instruction were just mov %rsp, %r13.  That would change the push to
> pushq 0x8(%rsp) and the third-to-last instruction to mov %r13, %rsp,
> saving something like 8 bytes of code.

I don't know why it doesn't do it the way you suggest, but I'm glad it
doesn't because I think it would make the DWARF/ORC data even more
complicated.  Here it's "simple", because r13 == DWARF CFA.
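
To make the "r13 == DWARF CFA" point concrete, here is a rough sketch
that models only the stack-pointer arithmetic of the r13 prologue and
epilogue quoted above.  The struct and function names are made up for
illustration; nothing here comes from objtool or GCC, and no memory
contents are simulated.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the register values after the r13 DRAP prologue. */
struct drap_frame {
    uint64_t cfa;   /* DWARF CFA: rsp value just before the call */
    uint64_t drap;  /* r13 after "lea 0x10(%rsp), %r13" */
    uint64_t rbp;   /* rbp after "mov %rsp, %rbp" */
    uint64_t rsp;   /* rsp after the final "push %r13" */
};

static struct drap_frame drap_prologue(uint64_t entry_rsp)
{
    struct drap_frame f;
    uint64_t rsp = entry_rsp;        /* points at the return address */

    f.cfa = entry_rsp + 8;           /* caller's last stack slot */
    rsp -= 8;                        /* push  %r13 */
    f.drap = rsp + 0x10;             /* lea   0x10(%rsp), %r13 */
    rsp &= ~(uint64_t)0xf;           /* and   $0xfffffffffffffff0, %rsp */
    rsp -= 8;                        /* pushq -0x8(%r13): return address copy */
    rsp -= 8;                        /* push  %rbp */
    f.rbp = rsp;                     /* mov   %rsp, %rbp */
    rsp -= 8;                        /* push  %r13: DRAP saved at -0x8(%rbp) */
    f.rsp = rsp;

    assert(f.drap == f.cfa);         /* the property the unwind data relies on */
    return f;
}

static uint64_t drap_epilogue_rsp(const struct drap_frame *f)
{
    /*
     * "mov -0x8(%rbp),%r13" reloads the DRAP value and "leaveq" restores
     * rbp; the rsp that leaveq produces is overwritten by the lea below.
     */
    uint64_t rsp = f->drap - 0x10;   /* lea -0x10(%r13), %rsp */
    rsp += 8;                        /* pop %r13 */
    return rsp;                      /* rsp at retq: the return address slot */
}
```

Calling drap_prologue() with any entry rsp gives drap == cfa no matter
how much padding the "and" inserted, and drap_epilogue_rsp() comes back
to the entry rsp.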

> I also don't get why any of this is needed.  Couldn't the compiler
> just do push %rbp; mov %rsp, %rbp; and $0xfffffffffffffff0, %rsp and
> be done with it?

Good question.  I wish it did just use the frame pointer, because
dealing with DRAP has been a headache.
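
For comparison, a similar sketch of the frame-pointer-only sequence you
describe (push %rbp; mov %rsp, %rbp; and $-align, %rsp), assuming the
epilogue is just leave + ret.  Again the names are hypothetical and this
only models the pointer arithmetic; it says nothing about why GCC
doesn't generate this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the register values after the prologue. */
struct fp_frame {
    uint64_t rbp;   /* rbp after "mov %rsp, %rbp" */
    uint64_t rsp;   /* rsp after the "and" */
};

static struct fp_frame fp_prologue(uint64_t entry_rsp, uint64_t align)
{
    struct fp_frame f;
    uint64_t rsp = entry_rsp;        /* points at the return address */

    /* The alignment must be a nonzero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);

    rsp -= 8;                        /* push %rbp */
    f.rbp = rsp;                     /* mov  %rsp, %rbp */
    rsp &= ~(align - 1);             /* and  $-align, %rsp */
    f.rsp = rsp;
    return f;
}

static uint64_t fp_epilogue_rsp(const struct fp_frame *f)
{
    uint64_t rsp = f->rbp;           /* leave: mov %rbp, %rsp */
    rsp += 8;                        /*        pop %rbp */
    return rsp;                      /* rsp at ret: the return address slot */
}
```

In this model the CFA is always rbp + 16, regardless of the padding
the "and" adds, so no extra register is needed to find it.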

> I compiled this:
> 
> void func()
> {
>     int var __attribute__((aligned(32)));
>     asm volatile ("" :: "m" (var));
> }
> 
> and got:
> 
> func:
>     leaq    8(%rsp), %r10
>     andq    $-32, %rsp
>     pushq    -8(%r10)
>     pushq    %rbp
>     movq    %rsp, %rbp
>     pushq    %r10
>     popq    %r10
>     popq    %rbp
>     leaq    -8(%r10), %rsp
>     ret
> 
> Which is better than the crud you pasted, since it at least uses a
> caller-saved reg (r10), but we still have the nasty addressing modes
> *and* an unnecessary push and pop of r10.
> 
> I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81825 and maybe
> some GCC person has a clue what's going on.

I've found that, when it does this DRAP pattern, most of the time it
uses r10.  The r13 version seems to be rarer.  I can provide a
real-world r13 example if that would help.

-- 
Josh
