[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65767CEA-370F-4180-9A81-116BE0BE2EA2@vmware.com>
Date: Thu, 4 Oct 2018 08:40:34 +0000
From: Nadav Amit <namit@...are.com>
To: Ingo Molnar <mingo@...nel.org>
CC: Ingo Molnar <mingo@...hat.com>,
LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, Jan Beulich <JBeulich@...e.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v9 04/10] x86: refcount: prevent gcc distortions
at 12:57 AM, Ingo Molnar <mingo@...nel.org> wrote:
>
> * Nadav Amit <namit@...are.com> wrote:
>
>> GCC considers the number of statements in inlined assembly blocks,
>> according to new-lines and semicolons, as an indication to the cost of
>> the block in time and space. This data is distorted by the kernel code,
>> which puts information in alternative sections. As a result, the
>> compiler may perform incorrect inlining and branch optimizations.
>>
>> The solution is to set an assembly macro and call it from the inlined
>> assembly block. As a result GCC considers the inline assembly block as
>> a single instruction.
>>
>> This patch allows to inline functions such as __get_seccomp_filter().
>> Interestingly, this allows more aggressive inlining while reducing the
>> kernel size.
>>
>> text data bss dec hex filename
>> 18140970 10225412 2957312 31323694 1ddf62e ./vmlinux before
>> 18140140 10225284 2957312 31322736 1ddf270 ./vmlinux after (-958)
>>
>> Static text symbols:
>> Before: 40302
>> After: 40286 (-16)
>>
>> Functions such as kref_get(), free_user(), fuse_file_get() now get
>> inlined.
>
> Yeah, so I kind of had your series on the back-burner (I'm sure you noticed!),
> mostly because what I complained about in a previous round of review a couple
> of months ago: that the description of the series and the changelog of every
> single patch in it is tiptoeing around the *real* problem and never truly
> describes it:
>
> ** This is a GCC bug, plain and simple, and we are uglifying **
> ** and complicating kernel assembly code to work it around. **
>
> We'd never ever consider such uglification for Clang, not even _close_.
> Sure this would have warranted a passing mention? Instead the changelogs are
> lovingly calling it a "distortion" as if this was no-one's fault really, and
> the patch a "solution".
>
> How about calling it a "GCC inlining bug" and a "workaround with costs"
> which it is in reality, and stop whitewashing the problem?
>
> At the same time I realize that we still need this series because GCC won't
> get fixed, so as a consolation I wrote the changelog below that explains
> how it really is, no holds barred.
>
> Since the tone of the changelog is a bit ... frosty, I added this disclaimer:
>
> [ mingo: Wrote new changelog. ]
>
> Let me know if you want me to make it more prominent that you had absolutely
> no role in writing that changelog.
>
> I'm also somewhat annoyed at the fact that this series carries a boatload
> of reviewed-by's and acked-by's, yet none of those reviewers found it
> important to point out the large chasm that is gaping between description
> and reality.
So, I’m sorry for missing your comment about misrepresenting the problem.
Feel free to do whatever you want with the commit message (just fix the typo
in “attemt"). As long as you don’t NAK the patches or send me to redo them,
it’s fine. I just want to clarify few things for you to consider.
First, you are right that clang does not have this issue (I checked), but
the behavior of gcc is clearly documented - once you know what to look for.
Second, I think the end result is not as ugly as you make it sound (and
maybe not ugly at all). Using this patch-set, you can write big blocks of
inlined assembly code without having those disgusting C macros. You can also
share the same code between inline asm and asm files.
You can have a look, for example, on ALTERNATIVE which is defined both as
assembly macro and C macro. Is the C macro readable? Is it easy to maintain
two different version? I do have a patch that merges the two implementations
together (which I still didn’t send, since I wait for the infra to be
applied first), and I think makes much more sense.
Finally, note that it’s not as if the binary always becomes smaller.
Overall, with the full patch-set it is slightly bigger. But still, that’s
how it was supposed to be if gcc wasn’t doing things badly.
Thanks again,
Nadav
Powered by blists - more mailing lists