linux-kernel - Re: [PATCH 0/6] Macrofying inline assembly for better compilation


Message-ID: <4F1E3042-CF9A-423C-BDB2-80ACD1B98488@vmware.com>
Date:   Fri, 18 May 2018 14:15:38 +0000
From:   Nadav Amit <namit@...are.com>
To:     David Laight <David.Laight@...LAB.COM>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>,
        Alok Kataria <akataria@...are.com>,
        Christopher Li <sparse@...isli.org>,
        "H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
        Jan Beulich <JBeulich@...e.com>,
        Jonathan Corbet <corbet@....net>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Juergen Gross <jgross@...e.com>,
        Kees Cook <keescook@...omium.org>,
        "linux-sparse@...r.kernel.org" <linux-sparse@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>
Subject: Re: [PATCH 0/6] Macrofying inline assembly for better compilation

David Laight <David.Laight@...LAB.COM> wrote:

> From: Nadav Amit
>> Sent: 17 May 2018 17:14
>> This patch-set deals with an interesting yet stupid problem: kernel code
>> that does not get inlined despite its simplicity. There are several
>> causes for this behavior: "cold" attribute on __init, different function
>> optimization levels; conditional constant computations based on
>> __builtin_constant_p(); and finally large inline assembly blocks.
>> 
>> This patch-set deals with the inline assembly problem. I separated these
>> patches from the others (that were sent in the RFC) for easier
>> inclusion.
>> 
>> The problem with inline assembly is that inline assembly is often used
>> by the kernel for things that are other than code - for example,
>> assembly directives and data. GCC however is oblivious to the content of
>> the blocks and assumes their cost in space and time is proportional to
>> the number of the perceived assembly "instruction", according to the
>> number of newlines and semicolons. Alternatives, paravirt and other
>> mechanisms are affected, causing code not to be inlined, and degrading
>> compilation quality in general.
>> 
>> The solution that this patch-set carries for this problem is to create
>> an assembly macro, and then call it from the inline assembly block.  As
>> a result, the compiler sees a single "instruction" and assigns the more
>> appropriate cost to the code. In addition, this patch-set removes
>> unneeded new-lines from common x86 inline asm's, which "confuse" GCC
>> heuristics.
> 
> Can't you get the same effect by using always_inline ?

I meant to explain in the cover letter, but forgot, why always_inline is not
a proper solution:

1. It is not easy to go over 400 functions and mark them as __always_inline.
   Maintaining it afterwards (i.e., removing the __always_inline if the
   function is changed and becomes "heavier") is even harder.

2. The kernel can be configured in many ways, which can make functions
   "cheaper" or more "expensive", so you cannot always predetermine
   whether a function should be inlined.

3. If you mark a function __always_inline, you may just cause the calling
   function not to be inlined (when it should be inlined as well). It becomes
   a game of whack-a-mole.

4. It is not only about inlining. The compiler also makes branch decisions
   based on the perceived cost of the code, including that of inlined
   functions.
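
As a rough illustration of the perceived-cost problem described in the cover
letter, the sketch below models the estimate as one plus the number of
newlines and semicolons in the asm template. It is a simplified model and not
GCC's actual code; perceived_asm_insns(), the EXAMPLE_BUG_ENTRY macro name and
both template strings are made up for the example.

```c
#include <stdio.h>

/*
 * Hypothetical helper: a simplified model of the size estimate described
 * above. Every newline or semicolon in the template is assumed to start
 * another "instruction". This is NOT GCC's implementation.
 */
static int perceived_asm_insns(const char *templ)
{
	int count = 1;

	for (; *templ; templ++)
		if (*templ == '\n' || *templ == ';')
			count++;
	return count;
}

/*
 * Illustrative open-coded template: one real instruction followed by
 * directives and data that are emitted into a separate section.
 */
static const char open_coded[] =
	"1:\tud2\n"
	".pushsection __bug_table,\"aw\"\n"
	"2:\t.long 1b - 2b\n"
	"\t.long 0\n"
	"\t.word 0\n"
	"\t.word 0\n"
	".popsection";

/*
 * The same content hidden behind a (hypothetical) assembler macro that
 * would be defined once elsewhere; the template is a single line.
 */
static const char macro_call[] = "EXAMPLE_BUG_ENTRY file=%c0 line=%c1";

/* Print the modelled cost of both templates. */
static void print_example_costs(void)
{
	printf("open-coded: %d\n", perceived_asm_insns(open_coded));
	printf("macro call: %d\n", perceived_asm_insns(macro_call));
}
```

Under this model the open-coded template counts as 7 "instructions" although
it contains one real instruction plus table data, while the macro invocation
counts as 1.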

Regards,
Nadav