linux-kernel - Re: [PATCH v2] noinstr: Use asm_inline() in instrumentation

Message-ID: <aAeFYB7E2QiRNeoM@gmail.com>
Date: Tue, 22 Apr 2025 14:02:40 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
	Peter Zijlstra <peterz@...radead.org>,
	Uros Bizjak <ubizjak@...il.com>
Subject: Re: [PATCH v2] noinstr: Use asm_inline() in
 instrumentation_{begin,end}()


* Josh Poimboeuf <jpoimboe@...nel.org> wrote:

> Use asm_inline() in the instrumentation begin/end macros to prevent the
> compiler from making poor inlining decisions based on the length of the
> objtool annotations.
> 
> Without the objtool annotations, each macro resolves to a single NOP.
> Using inline_asm() seems obviously correct here as it accurately
> communicates the actual code size to the compiler.

s/inline_asm
 /asm_inline

> 
> These macros are used by WARN() and lockdep, so this change can affect a
> lot of functions.
> 
> For a defconfig kernel built with GCC 14.2.1, bloat-o-meter reports a
> 0.17% increase in text size:
> 
>   add/remove: 74/352 grow/shrink: 914/353 up/down: 80747/-47120 (33627)
>   Total: Before=19460272, After=19493899, chg +0.17%
> 
> The text growth is presumably due to increased inlining.  A net total of
> 278 functions were removed (+74 / -352).  Each of the removed functions
> is likely inlined at multiple sites which explains the somewhat
> significant code growth.

So:

 - 353 functions shrank by 47120 bytes, that's -133 bytes per function
   affected.

 - 914 functions grew by 80747 bytes, that's +88 bytes per function,
   and there are roughly 2.6x as many of them.
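
For reference, the per-function averages above can be re-derived from the
quoted bloat-o-meter summary. This is only a quick sketch: the variable
names are made up for illustration, and the numbers are copied from the
two summary lines in the patch description.

```python
# Figures copied from the quoted bloat-o-meter summary:
#   add/remove: 74/352 grow/shrink: 914/353 up/down: 80747/-47120 (33627)
#   Total: Before=19460272, After=19493899, chg +0.17%
grown, shrunk = 914, 353            # number of functions that grew / shrank
up, down = 80747, -47120            # total bytes added / removed
before, after = 19460272, 19493899  # total text size in bytes

avg_growth = up / grown             # average bytes gained per grown function
avg_shrink = down / shrunk          # average bytes lost per shrunk function
net = up + down                     # net text growth in bytes
pct = 100.0 * (after - before) / before
ratio = grown / shrunk              # grown functions per shrunk function

print(f"avg growth per grown function:  {avg_growth:+.0f} bytes")
print(f"avg change per shrunk function: {avg_shrink:+.0f} bytes")
print(f"net growth: {net} bytes ({pct:+.2f}%)")
print(f"grown/shrunk ratio: {ratio:.1f}x")
```

The averages hide the distribution, so they say nothing about how much of
the growth comes from any single inlining site.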

That's a lot of net text growth, isn't it? It's certainly not just a 
single instruction or two per inlining, as asm_inline() would suggest.

> One example from Uros:
> 
>     $ grep "<encode_string>"  objdump.old
>    
>     00000000004506e0 <encode_string>:
>      45113c:       e8 9f f5 ff ff          call   4506e0 <encode_string>
>      452bcb:       e9 10 db ff ff          jmp    4506e0 <encode_string>
>      453d33:       e8 a8 c9 ff ff          call   4506e0 <encode_string>
>      453ef7:       e8 e4 c7 ff ff          call   4506e0 <encode_string>
>      45549f:       e8 3c b2 ff ff          call   4506e0 <encode_string>
>      455843:       e8 98 ae ff ff          call   4506e0 <encode_string>
>      455b37:       e8 a4 ab ff ff          call   4506e0 <encode_string>
>      455b47:       e8 94 ab ff ff          call   4506e0 <encode_string>
>      4564fa:       e8 e1 a1 ff ff          call   4506e0 <encode_string>
>      456669:       e8 72 a0 ff ff          call   4506e0 <encode_string>
>      456691:       e8 4a a0 ff ff          call   4506e0 <encode_string>
>      4566a0:       e8 3b a0 ff ff          call   4506e0 <encode_string>
>      4569aa:       e8 31 9d ff ff          call   4506e0 <encode_string>
>      456e79:       e9 62 98 ff ff          jmp    4506e0 <encode_string>
>      456efe:       e9 dd 97 ff ff          jmp    4506e0 <encode_string>
>    
>     All these are calls now inline:
>    
>     encode_string                                 58       -     -58
>    
>     ... where for example encode_putfh() grows by:
>    
>     encode_putfh                                  70     118     +48

That still doesn't make it clear where the apparently ~10 instructions 
per inlining come from, right?
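
To put a number on the example, the references in the quoted objdump
excerpt can be counted by mnemonic. A minimal sketch, assuming the line
layout shown in the excerpt (address, opcode bytes, mnemonic, target,
symbol); count_references() is a hypothetical helper, not an existing
kernel script.

```python
from collections import Counter

# Lines copied from the quoted "grep '<encode_string>' objdump.old" output.
OBJDUMP_EXCERPT = """\
00000000004506e0 <encode_string>:
 45113c:       e8 9f f5 ff ff          call   4506e0 <encode_string>
 452bcb:       e9 10 db ff ff          jmp    4506e0 <encode_string>
 453d33:       e8 a8 c9 ff ff          call   4506e0 <encode_string>
 453ef7:       e8 e4 c7 ff ff          call   4506e0 <encode_string>
 45549f:       e8 3c b2 ff ff          call   4506e0 <encode_string>
 455843:       e8 98 ae ff ff          call   4506e0 <encode_string>
 455b37:       e8 a4 ab ff ff          call   4506e0 <encode_string>
 455b47:       e8 94 ab ff ff          call   4506e0 <encode_string>
 4564fa:       e8 e1 a1 ff ff          call   4506e0 <encode_string>
 456669:       e8 72 a0 ff ff          call   4506e0 <encode_string>
 456691:       e8 4a a0 ff ff          call   4506e0 <encode_string>
 4566a0:       e8 3b a0 ff ff          call   4506e0 <encode_string>
 4569aa:       e8 31 9d ff ff          call   4506e0 <encode_string>
 456e79:       e9 62 98 ff ff          jmp    4506e0 <encode_string>
 456efe:       e9 dd 97 ff ff          jmp    4506e0 <encode_string>
"""


def count_references(text, symbol):
    """Count references to `symbol`, grouped by instruction mnemonic."""
    counts = Counter()
    tag = f"<{symbol}>"
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "<symbol>:" definition line; keep only
        # lines whose last field is the symbol reference itself.
        if len(fields) < 3 or fields[-1] != tag:
            continue
        # Layout assumed: ... opcode bytes, mnemonic, target address, <symbol>
        counts[fields[-3]] += 1
    return counts


counts = count_references(OBJDUMP_EXCERPT, "encode_string")
total_sites = sum(counts.values())
print(dict(counts), "total:", total_sites)
```

This only counts the sites that referenced the 58-byte out-of-line copy
in the old build; it does not explain why a caller such as encode_putfh()
grows by 48 bytes.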

Thanks,

	Ingo