Message-ID: <20210512143105.GW10366@gate.crashing.org>
Date: Wed, 12 May 2021 09:31:05 -0500
From: Segher Boessenkool <segher@...nel.crashing.org>
To: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] powerpc: Force inlining of csum_add()
On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> On 11/05/2021 at 12:51, Segher Boessenkool wrote:
> >Something seems to have decided this asm is more expensive than it is.
> >That isn't always avoidable -- the compiler cannot look inside asms --
> >but it seems it could be improved here.
> >
> >Do you have (or can make) a self-contained testcase?
>
> I have not tried, and I fear it might be difficult, because on a kernel
> build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such
> an issue.
Yeah. Sometimes you can force some of the decisions, but that usually
requires knowing too many GCC internals :-/
> >>And there is even one completely unused instance of csum_add().
> >
> >That is strange, that should never happen.
>
> It seems that several .o include unused versions of csum_add. After the
> final link, one remains (in addition to the used one) in vmlinux.
But it is a static function, so it should not end up in any object file
where it isn't used.
> >>In the non-inlined version, the first sum with 0 was performed.
> >>Here it is skipped.
> >
> >That is because of how __builtin_constant_p works, most likely. As we
> >discussed elsewhere it is evaluated before all forms of loop unrolling.
>
> But we are not talking about loop unrolling here, are we?
Oh, right you are, but that doesn't change much. The
__builtin_constant_p(len) is evaluated long before the compiler sees len
is a constant here.
> It seems that the reason here is that __builtin_constant_p() is evaluated
> long after GCC decided to not inline that call to csum_add().
Yes, it seems we do not currently do even trivial inlining except very
early in the compiler.
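To make the interaction concrete, here is a minimal sketch of the pattern
under discussion. It is not the kernel's csum_add(): the name
csum_add_sketch, the plain uint32_t types and the portable C arithmetic
(standing in for the powerpc asm) are all assumptions made for
illustration. The always_inline attribute is the GCC mechanism that
forces inlining regardless of the cost estimate.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for csum_add(): a 32-bit ones' complement add.
 * always_inline asks GCC to inline every call, whatever its size estimate.
 */
static inline __attribute__((always_inline))
uint32_t csum_add_sketch(uint32_t csum, uint32_t addend)
{
	/*
	 * These shortcuts only fire if the argument is already known to be
	 * constant at the point where __builtin_constant_p() is folded,
	 * which in practice means after the call has been inlined.
	 */
	if (__builtin_constant_p(csum) && csum == 0)
		return addend;
	if (__builtin_constant_p(addend) && addend == 0)
		return csum;

	/* Add in 64 bits, then fold the carry back in (end-around carry). */
	uint64_t res = (uint64_t)csum + addend;
	return (uint32_t)res + (uint32_t)(res >> 32);
}
```

With a call such as csum_add_sketch(0, x), the first shortcut can remove
the addition entirely, but only when the call is inlined early enough
for the 0 to be visible. The result is the same either way; only the
generated code differs.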
Thanks,
Segher