Message-ID: <CA+55aFwLveBBp7mfe9k=rCL89Tviy23=0YVof-PFwuznddgHoQ@mail.gmail.com>
Date: Fri, 15 Sep 2017 11:01:19 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Andrey Ryabinin <aryabinin@...tuozzo.com>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Alexander Potapenko <glider@...gle.com>,
Dmitriy Vyukov <dvyukov@...gle.com>,
Matthias Kaehlcke <mka@...omium.org>,
Arnd Bergmann <arnd@...db.de>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [RFC PATCH 3/4] x86/asm: Make alternative macro interfaces more
clear and consistent
On Fri, Sep 15, 2017 at 9:53 AM, Andrey Ryabinin
<aryabinin@...tuozzo.com> wrote:
>
> I'm not so sure that this is a disabled optimization. I assume that the global rsp
> changes something in gcc's register allocation logic, or something like that, leading
> to subtle changes in the generated code.
>
> But what I recently found out is that this "regression" is sometimes actually an improvement in .text size.
> It all depends on .config, e.g:

Oh, that would be lovely and solve all the issues.

And looking at the code generation differences for one file
(kernel/futex.c) and one single config (my default config), the thing
that the global stack register seems to change is that it moves some
code - particularly completely unrelated inline asm code - inside the
region protected by frame pointers.

There are a few register allocation changes too, but they didn't seem
to make code worse, and I think they were just "incidental" from code
movement. And most code movement really seemed to be around inline
asms. I wonder if the gcc logic simply is something like "if the
stack pointer is visible as a register, don't move any inline asm
across a frame setup".

In fact, on that one file and one configuration, the resulting
assembler file had three fewer lines of code with that global stack
register declaration than with the local one.

So at least from just that one case, I can back up Andrey's
observation: it's not that the code gets worse, it just is slightly
different. Sometimes it's better.

So maybe that simple patch to just make the stack pointer be a global
register declaration really is the fix for this issue.

It's not *pretty*, and I'd much rather just see some explicit way for
us to say "this asm wants the frame to be set up", but of the
alternatives we've seen, maybe it's the right thing to do?
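For readers following along, here is a rough user-space sketch of what
a "global stack register declaration" used as an asm constraint can
look like. It is not the patch under discussion: the names
current_stack_pointer and add_one, the HAVE_GLOBAL_SP guard and the
plain-C fallback are all assumptions made up for illustration, and the
register-variable path only applies to gcc-compatible compilers on
x86-64.

```c
#include <assert.h> /* only needed for the sanity check in the note below */

#if defined(__x86_64__) && defined(__GNUC__)
/*
 * File-scope ("global") register variable bound to the stack pointer.
 * The local alternative would declare the same kind of variable inside
 * each function that needs it.
 */
register unsigned long current_stack_pointer __asm__("rsp");
#define HAVE_GLOBAL_SP 1
#else
#define HAVE_GLOBAL_SP 0
#endif

/* Hypothetical helper: adds one to x using inline asm where possible. */
static inline int add_one(int x)
{
#if HAVE_GLOBAL_SP
	/*
	 * Listing the stack pointer as an in/out operand tells the
	 * compiler that this asm depends on the stack pointer. The asm
	 * body itself never touches operand %1.
	 */
	__asm__ __volatile__("addl $1, %0"
			     : "+r" (x), "+r" (current_stack_pointer)
			     :
			     : "cc");
#else
	/* Assumed fallback so the sketch still builds elsewhere. */
	x += 1;
#endif
	return x;
}
```

A quick sanity check such as assert(add_one(41) == 42) should hold on
either path. Whether the compiler then keeps the asm behind the frame
setup is exactly the behaviour being guessed at above, so comparing the
generated assembly is the only real way to tell.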
Linus