Message-ID: <aAf2crZau98tHFSn@gmail.com>
Date: Tue, 22 Apr 2025 22:05:06 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Uros Bizjak <ubizjak@...il.com>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, x86@...nel.org,
	linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2] noinstr: Use asm_inline() in
 instrumentation_{begin,end}()


* Uros Bizjak <ubizjak@...il.com> wrote:

> > That still doesn't make it clear where the apparently ~10 
> > instructions per inlining come from, right?
> 
> The growth is actually from different inlining decisions, that cover 
> not only inlining of small functions, but other code blocks (hot vs. 
> cold, tail duplication, etc) too. The compiler uses certain 
> thresholds to estimate inlining gain (thresholds are different for 
> -Os and -O2). Artificially bloated functions that don't use 
> asm_inline() fall under this threshold (IOW, the inlining would 
> increase size too much), so they are not inlined; code blocks that 
> enclose unfixed asm clauses are treated differently than when they 
> use asm_inline() instead of asm(). When asm_inline() is introduced, 
> the size of the function (and consequently inlining gain) is 
> estimated more accurately, the estimated size is lower, so there is 
> more inlining happening.
> 
> I'd again remark that the code size is not the right metric when 
> compiling with -O2.

Understood, but we still need some way to measure whether marking 
these primitives with asm_inline() is beneficial in isolation - even 
if, on a real build, the noise of GCC's overall inlining decisions 
obscures the results and may even reverse them.

Is there a way to coax GCC into a mode of build where such changes can 
be measured in a better fashion?

For example, would building with -finline-limit=1000 or 
-finline-limit=10 (or some other well-chosen inlining threshold, or 
tweaked inliner parameters via --param), just for the sake of 
measurement, give us more representative .text size changes?
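
To make the question concrete, here is a minimal stand-alone sketch of 
the kind of test file I have in mind. It is not kernel code: the file 
name, the ANNOTATION macro, the .discard.demo section and the function 
names are all made up for illustration, and it assumes an ELF target 
and a compiler that accepts the "asm inline" qualifier (GCC 9 or 
later). GCC estimates the size of an asm statement from the number of 
lines in its template, while "asm inline" tells it to assume the 
minimum size, so the two helpers below do the same work but may look 
different to the inliner:

```c
/* demo.c: hypothetical measurement file, not taken from the kernel. */

/*
 * Assumption: the compiler accepts the "asm inline" qualifier.
 * __asm__ and __inline are used so this also builds in strict ISO modes.
 */
#define asm_inline __asm__ __inline

/*
 * Hypothetical annotation: three template lines that emit nothing into
 * .text, only a 4-byte record in a separate section.
 */
#define ANNOTATION \
	".pushsection .discard.demo, \"a\"\n\t" \
	".long 0\n\t" \
	".popsection\n\t"

/* Plain asm: the size estimate is based on the template's line count. */
static int add_one_plain(int x)
{
	__asm__ volatile(ANNOTATION);
	return x + 1;
}

/* asm inline: the statement is treated as having minimum size. */
static int add_one_inline(int x)
{
	asm_inline volatile(ANNOTATION);
	return x + 1;
}

/*
 * Each helper is called twice, so the inliner has to weigh its
 * estimated size instead of inlining a function that is called once.
 */
int call_plain(int x)
{
	return add_one_plain(add_one_plain(x));
}

int call_inline(int x)
{
	return add_one_inline(add_one_inline(x));
}
```

Building it with something like "gcc -O2 -c demo.c" and again with 
"gcc -O2 -finline-limit=10 -c demo.c", then comparing "size demo.o" and 
the disassembly of the two callers, would show whether a given 
threshold makes the asm() vs. asm_inline() difference visible. Whether 
the two callers actually end up different depends on the compiler 
version and flags.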

Because, ideally, if we make these decisions correctly at the asm() 
level, compilers will, eventually, after a few decades, catch up
and do the right thing as well. ;-)

Thanks,

	Ingo
