[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d38e5e1c-29b6-8cc6-7409-d0bdd5772f23@csgroup.eu>
Date: Thu, 17 Feb 2022 14:50:17 +0000
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: David Laight <David.Laight@...LAB.COM>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Masahiro Yamada <yamada.masahiro@...ionext.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net v3] net: Force inlining of checksum functions in
net/checksum.h
Adding Ingo, Andrew and Nick as they were involved in the subjet,
Le 17/02/2022 à 14:36, David Laight a écrit :
> From: Christophe Leroy
>> Sent: 17 February 2022 12:19
>>
>> All functions defined as static inline in net/checksum.h are
>> meant to be inlined for performance reason.
>>
>> But since commit ac7c3e4ff401 ("compiler: enable
>> CONFIG_OPTIMIZE_INLINING forcibly") the compiler is allowed to
>> uninline functions when it wants.
>>
>> Fair enough in the general case, but for tiny performance critical
>> checksum helpers that's counter-productive.
>
> There isn't a real justification for allowing the compiler
> to 'not inline' functions in that commit.
Do you mean that the two following commits should be reverted:
- 889b3c1245de ("compiler: remove CONFIG_OPTIMIZE_INLINING entirely")
- 4c4e276f6491 ("net: Force inlining of checksum functions in
net/checksum.h")
>
> It rather seems backwards.
> The kernel sources don't really have anything marked 'inline'
> that shouldn't always be inlined.
> If there are any such functions they are few and far between.
>
> I've had enough trouble (elsewhere) getting gcc to inline
> static functions that are only called once.
> I ended up using 'always_inline'.
> (That is 4k of embedded object code that will be too slow
> if it ever spills a register to stack.)
>
I agree with you that that change is a nightmare with many small
functions that we really want inlined, and when we force inlining we
most of the time get a smaller binary.
And it becomes even more problematic when we start adding
instrumentation like stack protector.
According to the original commits however this was supposed to provide
real benefit:
- 60a3cdd06394 ("x86: add optimized inlining")
- 9012d011660e ("compiler: allow all arches to enable
CONFIG_OPTIMIZE_INLINING")
But when I build ppc64le_defconfig + CONFIG_CC_OPTIMISE_FOR_SIZE I get:
112 times queued_spin_unlock()
122 times mmiowb_spin_unlock()
151 times cpu_online()
225 times __raw_spin_unlock()
So I was wondering, would we have a way to force inlining of functions
marked inline in header files while leaving GCC handling the ones in C
files the way it wants ?
Christophe
Powered by blists - more mailing lists