Message-ID: <4F1E3042-CF9A-423C-BDB2-80ACD1B98488@vmware.com>
Date: Fri, 18 May 2018 14:15:38 +0000
From: Nadav Amit <namit@...are.com>
To: David Laight <David.Laight@...LAB.COM>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
Alok Kataria <akataria@...are.com>,
Christopher Li <sparse@...isli.org>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Jan Beulich <JBeulich@...e.com>,
Jonathan Corbet <corbet@....net>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Juergen Gross <jgross@...e.com>,
Kees Cook <keescook@...omium.org>,
"linux-sparse@...r.kernel.org" <linux-sparse@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Randy Dunlap <rdunlap@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH 0/6] Macrofying inline assembly for better compilation
David Laight <David.Laight@...LAB.COM> wrote:
> From: Nadav Amit
>> Sent: 17 May 2018 17:14
>> This patch-set deals with an interesting yet stupid problem: kernel code
>> that does not get inlined despite its simplicity. There are several
>> causes for this behavior: "cold" attribute on __init, different function
>> optimization levels; conditional constant computations based on
>> __builtin_constant_p(); and finally large inline assembly blocks.
>>
>> This patch-set deals with the inline assembly problem. I separated these
>> patches from the others (that were sent in the RFC) for easier
>> inclusion.
>>
>> The problem with inline assembly is that inline assembly is often used
>> by the kernel for things that are other than code - for example,
>> assembly directives and data. GCC however is oblivious to the content of
>> the blocks and assumes their cost in space and time is proportional to
>> the number of perceived assembly "instructions", inferred from the
>> number of newlines and semicolons. Alternatives, paravirt and other
>> mechanisms are affected, causing code not to be inlined, and degrading
>> compilation quality in general.
>>
>> The solution that this patch-set carries for this problem is to create
>> an assembly macro, and then call it from the inline assembly block. As
>> a result, the compiler sees a single "instruction" and assigns the more
>> appropriate cost to the code. In addition, this patch-set removes
>> unneeded new-lines from common x86 inline asm's, which "confuse" GCC
>> heuristics.
>
> Can't you get the same effect by using always_inline ?
I meant to explain in the cover letter, but forgot, why always_inline is
not a proper solution:
1. It is not easy to go over 400 functions and mark them as __always_inline.
Maintaining it afterwards (i.e., removing the __always_inline if the
function is changed and becomes "heavier") is even harder.
2. The kernel can be configured in many ways, which can make
functions "cheaper" or more "expensive", so you cannot always
predetermine whether a function should be inlined.
3. If you mark a function __always_inline, you may simply cause its caller
not to be inlined (when it should be inlined as well). It becomes
whack-a-mole.
4. It is not only about inlining. The compiler also makes branch decisions
based on the perceived cost of the code, including that of inlined functions.
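To make the mechanism from the cover letter concrete, here is a minimal
user-space sketch. It is not taken from the patches. All names
(perceived_insn_count, DEMO_ADD_ONE, demo_add_one, .demo_notes) are
hypothetical, and the counting function only approximates the heuristic
described above (one "instruction" plus one per newline or semicolon).
The assembly path assumes GCC on x86-64 Linux; elsewhere the sketch falls
back to plain C.

```c
#include <assert.h>

/*
 * Rough approximation of the heuristic described in the cover letter:
 * one "instruction", plus one more for every newline or semicolon in
 * the asm template. Hypothetical helper, not GCC's actual code.
 */
int perceived_insn_count(const char *templ)
{
	int count = 1;

	for (; *templ; templ++)
		if (*templ == '\n' || *templ == ';')
			count++;
	return count;
}

#if defined(__x86_64__) && defined(__linux__) && \
    defined(__GNUC__) && !defined(__clang__)
/*
 * Define the assembler macro once, at file scope. The body mixes one real
 * instruction with directives and data, which is the case the cover
 * letter describes.
 */
__asm__(".macro DEMO_ADD_ONE reg\n\t"
	"addl $1, \\reg\n\t"
	".pushsection .demo_notes, \"a\"\n\t"
	".byte 0\n\t"
	".popsection\n\t"
	".endm\n");

int demo_add_one(int x)
{
	/* The compiler sees a single line here, whatever the macro expands to. */
	__asm__("DEMO_ADD_ONE %0" : "+r"(x) : : "cc");
	return x;
}
#else
/* Assumption: on other compilers or targets, fall back to plain C. */
int demo_add_one(int x)
{
	return x + 1;
}
#endif

void demo_self_check(void)
{
	/* Open-coded, the same body is perceived as four "instructions"... */
	assert(perceived_insn_count("addl $1, %0\n\t"
				    ".pushsection .demo_notes, \"a\"\n\t"
				    ".byte 0\n\t"
				    ".popsection") == 4);
	/* ...while the macro invocation is perceived as one. */
	assert(perceived_insn_count("DEMO_ADD_ONE %0") == 1);
	assert(demo_add_one(41) == 42);
}
```

Calling demo_self_check() exercises both parts. The only real instruction
in either form is the single addl; the remaining lines are directives and
data, yet under this heuristic they still count toward the perceived size
of the open-coded block.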
Regards,
Nadav