Message-ID: <aAf2crZau98tHFSn@gmail.com>
Date: Tue, 22 Apr 2025 22:05:06 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Uros Bizjak <ubizjak@...il.com>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, x86@...nel.org,
linux-kernel@...r.kernel.org, Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2] noinstr: Use asm_inline() in
instrumentation_{begin,end}()

* Uros Bizjak <ubizjak@...il.com> wrote:

> > That still doesn't make it clear where the apparently ~10
> > instructions per inlining come from, right?
>
> The growth is actually from different inlining decisions, that cover
> not only inlining of small functions, but other code blocks (hot vs.
> cold, tail duplication, etc) too. The compiler uses certain
> thresholds to estimate inlining gain (thresholds are different for
> -Os and -O2). Artificially bloated functions that don't use
> asm_inline() fall under this threshold (IOW, the inlining would
> increase size too much), so they are not inlined; code blocks that
> enclose unfixed asm clauses are treated differently than when they
> use asm_inline() instead of asm(). When asm_inline() is introduced,
> the size of the function (and consequently inlining gain) is
> estimated more accurately, the estimated size is lower, so there is
> more inlining happening.
>
> I'd again remark that the code size is not the right metric when
> compiling with -O2.

Understood, but we still have to be able to measure whether marking
these primitives with asm_inline() is beneficial in isolation - even if
on a real build the noise of GCC's overall inlining decisions obscures
the results, or even reverses them.

Is there a way to coax GCC into a build mode where such changes can be
measured more reliably?

For example, would setting -finline-limit=1000 or -finline-limit=10 (or
some other well-chosen inlining threshold, or tweaking any of the
inliner parameters via --param values), just for the sake of
measurement, give us more representative .text size changes?
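
To make the comparison concrete, here is a minimal standalone sketch of
the kind of pair one would measure. It is not kernel code: the macro,
section and function names are hypothetical, and it assumes GCC 9 or
later (the first release with the "asm inline" qualifier) targeting an
ELF platform. The two helpers are identical except for the qualifier,
so any difference in the generated code comes from how the compiler
estimates the size of the asm statement.

```c
/* Sketch only: every name below is hypothetical, not a kernel identifier. */

/* Same spelling the kernel uses behind asm_inline(); needs GCC 9+. */
#define sketch_asm_inline __asm__ __inline

/*
 * One nop in .text plus bookkeeping emitted into a side section.
 * The string spans several lines, so without the inline qualifier the
 * compiler's size estimate counts several "instructions" even though
 * only the nop lands in the function body.
 */
#define SKETCH_ANNOTATE					\
	"1: nop\n\t"					\
	".pushsection .discard.sketch, \"a\"\n\t"	\
	".long 1b - .\n\t"				\
	".popsection\n\t"

static inline int bump_plain(int x)
{
	/* Plain asm(): size is estimated from the text of the statement. */
	__asm__ __volatile__(SKETCH_ANNOTATE);
	return x + 1;
}

static inline int bump_hinted(int x)
{
	/* asm inline: the statement is costed as minimum size for inlining. */
	sketch_asm_inline __volatile__(SKETCH_ANNOTATE);
	return x + 1;
}

/* Out-of-line callers, so the two variants can be compared per symbol. */
int call_plain(int x)
{
	return bump_plain(bump_plain(x));
}

int call_hinted(int x)
{
	return bump_hinted(bump_hinted(x));
}
```

Building this with something like "gcc -O2 -c sketch.c", with and
without a -finline-limit=<n> override, and then comparing call_plain()
and call_hinted() in the disassembly or in per-symbol size output is one
way to look at the effect in isolation. Helpers this small may well be
inlined in both variants, so the sketch shows the shape of the
experiment rather than a predicted result.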

Because, ideally, if we make these decisions correctly at the asm()
level, compilers will, eventually, after a few decades, catch up
and do the right thing as well. ;-)

Thanks,

	Ingo