lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 18 Sep 2017 12:40:31 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Andrey Ryabinin <aryabinin@...tuozzo.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        the arch/x86 maintainers <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Alexander Potapenko <glider@...gle.com>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        Matthias Kaehlcke <mka@...omium.org>,
        Arnd Bergmann <arnd@...db.de>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH 3/4] x86/asm: Make alternative macro interfaces more
 clear and consistent

On Sun, Sep 17, 2017 at 01:22:32AM +0300, Andrey Ryabinin wrote:
> On 09/16/2017 02:29 AM, Josh Poimboeuf wrote:
> > On Fri, Sep 15, 2017 at 11:01:19AM -0700, Linus Torvalds wrote:
> >> On Fri, Sep 15, 2017 at 9:53 AM, Andrey Ryabinin
> >> <aryabinin@...tuozzo.com> wrote:
> >>>
> >>> I'm not so sure that this is disabled optimization. I assume that global rsp makes
> >>> changes something in gcc's register allocation logic, or something like that leading
> >>> to subtle changes in generated code.
> >>>
> >>> But what I recently find out, is that this "regression" sometimes is actually improvement in .text size.
> >>> It all depends on .config, e.g:
> >>
> >> Oh, that would be lovely and solve all the issues.
> >>
> >> And looking at the code generation differences for one file
> >> (kernel/futex.c) and one single config (my default config), the thing
> >> that the global stack register seems to change is that it moves some
> >> code - particularly completely unrelated inline asm code - inside the
> >> region protected by frame pointers.
> >>
> >> There are a few register allocation changes too, but they didn't seem
> >> to make code worse, and I think they were just "incidental" from code
> >> movement. And most code movement really seemed to be around inline
> >> asms, I  wonder if the gcc logic simply is something like "if the
> >> stack pointer is visible as a register, don't move any inline asm
> >> across a frame setup".
> >>
> >> In fact, on that one file and one configuration, the resulting
> >> assembler file had three fewer lines of code with that global stack
> >> register declaration than with the local one.
> >>
> >> So at least from just that one case, I can back up Andrey's
> >> observation: it's not that the code gets worse, it just is slightly
> >> different. Sometimes it's better.
> >>
> >> So maybe that simple patch to just make the stack pointer be a global
> >> register declaration really is the fix for this issue.
> >>
> >> It's not *pretty*, and I'd much rather just see some explicit way for
> >> us to say "this asm wants the frame to be set up", but of the
> >> alternatives we've seen, maybe it's the right thing to do?
> > 
> > Ok, here's the (compile tested) patch in case anybody wants to try it
> > out.
> > 
> 
> I already had almost the same patch, so I just used mine. Patch seems to be
> the same as yours except cosmetic details which shouldn't affect the end result.
> But just in case it posted below.
> 
> 
> The patch was applied on top of b38923a068c10fc36ca8f596d650d095ce390b85 commit,
> kernel compiled gcc 7.2.0 
> 
> allnoconfig:   text    data     bss     dec     hex filename
> base           946920  206384 1215752 2369056  242620 allnoconfig/vmlinux
> patched        946501  206384 1215752 2368637  24247d allnoconfig_p/vmlinux
> 
> defconfig:     text    data     bss     dec     hex filename
> base           10729978        4840000  876544 16446522         faf43a defconfig/vmlinux
> patched        10730666        4840000  876544 16447210         faf6ea defconfig_p/vmlinux
> 
> allyesconfig:   text    data     bss     dec     hex filename
> base            161016572       152888347       49303552        363208471       15a61f17        allyesconfig/vmlinux
> patched         161003557       152888347       49303552        363195456       15a5ec40        allyesconfig_p/vmlinux

I get similar results with my config: slightly smaller .text, both with
*and* without frame pointers.  So yeah, this is probably the best
option, or at least, the least-worst option.  I shouldn't have dismissed
the idea so quickly before.

(H.J. Lu suggested another idea, which is to use "rbp" as an input
constraint.  I tried it, but it added 22k of text to my kernel with
frame pointers disabled, so that's definitely worse than this.)

If testing continues to goes well, I'll submit an official patch soon.

-- 
Josh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ