linux-kernel - Re: [PATCH v2] noinstr: Use asm_inline() in instrumentation


Message-ID: <aAf2crZau98tHFSn@gmail.com>
Date: Tue, 22 Apr 2025 22:05:06 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Uros Bizjak <ubizjak@...il.com>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, x86@...nel.org,
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2] noinstr: Use asm_inline() in
 instrumentation_{begin,end}()

* Uros Bizjak <ubizjak@...il.com> wrote:

> > That still doesn't make it clear where the apparently ~10 
> > instructions per inlining come from, right?
> 
> The growth is actually from different inlining decisions, that cover 
> not only inlining of small functions, but other code blocks (hot vs. 
> cold, tail duplication, etc) too. The compiler uses certain 
> thresholds to estimate inlining gain (thresholds are different for 
> -Os and -O2). Artificially bloated functions that don't use 
> asm_inline() fall under this threshold (IOW, the inlining would 
> increase size too much), so they are not inlined; code blocks that 
> enclose unfixed asm clauses are treated differently than when they 
> use asm_inline() instead of asm(). When asm_inline() is introduced, 
> the size of the function (and consequently inlining gain) is 
> estimated more accurately, the estimated size is lower, so there is 
> more inlining happening.
> 
> I'd again remark that the code size is not the right metric when 
> compiling with -O2.
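
To illustrate the size-estimate point quoted above, here is a minimal
sketch. It is not kernel code: demo_asm_inline, DEMO_EMPTY_LINES and the
bump_*() helpers are hypothetical names made up for this example. It
relies on the documented GCC behaviour that the size of an asm statement
is estimated from the number of assembler lines in its string, and that
the "inline" qualifier makes the inliner assume the smallest possible
size instead.

```c
/* Hypothetical demo, not taken from the kernel tree. */
#include <assert.h>

/*
 * Use the "inline" asm qualifier only where it is known to be accepted
 * (GCC 9 and newer). Every other compiler gets plain asm here, which is
 * a conservative assumption for the sake of a portable example.
 */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 9)
# define demo_asm_inline __asm__ __inline__
#else
# define demo_asm_inline __asm__
#endif

/*
 * Eight empty assembler lines. They emit no machine code, but GCC's
 * size estimate counts the line separators, so a plain asm() holding
 * this string looks like several instructions to the inliner.
 */
#define DEMO_EMPTY_LINES "\n\n\n\n\n\n\n\n"

/* Plain asm: the inliner sees an artificially large function body. */
static inline int bump_plain(int x)
{
	__asm__ volatile(DEMO_EMPTY_LINES);
	return x + 1;
}

/* asm inline: the statement is costed at the minimum possible size. */
static inline int bump_asm_inline(int x)
{
	demo_asm_inline volatile(DEMO_EMPTY_LINES);
	return x + 1;
}

/* Both variants compute the same value; only the size estimate differs. */
void demo_check(void)
{
	assert(bump_plain(41) == 42);
	assert(bump_asm_inline(41) == 42);
}
```

Both helpers generate identical instructions. Whether the compiler ends
up making a different inlining decision for callers of one versus the
other depends on the compiler version, the optimization level and the
surrounding code, which is the measurement difficulty discussed in this
thread.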

Understood, but we still have to be able to measure somehow whether
marking these primitives with asm_inline() is beneficial in isolation -
even if, on a real build, the noise of GCC's overall inlining decisions
obscures the results, and may even reverse them.

Is there a way to coax GCC into a build mode where such changes can be
measured more reliably?

For example, would setting -finline-limit=1000 or -finline-limit=10 (or
some other well-chosen inlining threshold value, or tweaking any of the
inliner parameters via --param values), just for the sake of
measurement, give us more representative .text size change values?

Because, ideally, if we make these decisions correctly at the asm()
level, compilers will, eventually, after a few decades, catch up
and do the right thing as well. ;-)

Thanks,

	Ingo