Message-ID: <20260119105017.262276b5@pumpkin>
Date: Mon, 19 Jan 2026 10:50:17 +0000
From: David Laight <david.laight.linux@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-kernel
<linux-kernel@...r.kernel.org>, netdev@...r.kernel.org, Jakub Kicinski
<kuba@...nel.org>, Eric Dumazet <eric.dumazet@...il.com>, Paolo Abeni
<pabeni@...hat.com>, Nicolas Pitre <npitre@...libre.com>
Subject: Re: [PATCH] compiler_types: Introduce inline_for_performance

On Mon, 19 Jan 2026 11:25:52 +0100
Eric Dumazet <edumazet@...gle.com> wrote:
> On Mon, Jan 19, 2026 at 10:33 AM David Laight
> <david.laight.linux@...il.com> wrote:
> >
> > On Sun, 18 Jan 2026 16:01:25 -0800
> > Andrew Morton <akpm@...ux-foundation.org> wrote:
> >
> > > On Sun, 18 Jan 2026 22:58:02 +0000 David Laight <david.laight.linux@...il.com> wrote:
> > >
> > > > > mm/ alone has 74 __always_inlines, none are documented, I don't know
> > > > > why they're present, many are probably wrong.
> > > > >
> > > > > Shit, uninlining only __get_user_pages_locked does this:
> > > > >
> > > > > >    text    data     bss     dec     hex filename
> > > > > >  115703   14018      64  129785   1faf9 mm/gup.o
> > > > > >  103866   13058      64  116988   1c8fc mm/gup.o-after
> > > >
> > > > The next questions are does anything actually run faster (either way),
> > > > and should anything at all be marked 'inline' rather than 'always_inline'.
> > > >
> > > > After all, if you call a function twice (not in a loop) you may
> > > > want a real function in order to avoid I-cache misses.
> > >
> > > yup
> >
> > I had two adjacent strlen() calls in a bit of code, the first was an
> > array (in a structure) and gcc inlined the 'word at a time' code, the
> > second was a pointer and it called the library function.
> > That had to be sub-optimal...
> >
> > > > But I'm sure there is a lot of code that is 'inline_for_bloat' :-)
> > >
> > > ooh, can we please have that?
> >
> > Or 'inline_to_speed_up_benchmark' and the associated 'unroll this loop
> > because that must make it faster'.
> >
> > > I do think that every always_inline should be justified and commented,
> > > but I haven't been energetic about asking for that.
> >
> > Apart from the 4-line functions where it is clearly obvious.
> > Especially since the compiler can still decide to not-inline them
> > if they are only 'inline'.
> >
> > > A fun little project would be to go through each one, figure out whether
> > > there were good reasons and, if not, just remove them and see if anyone
> > > explains why that was incorrect.
> >
> > It's not just always_inline, a lot of the inline are dubious.
> > Probably why the networking code doesn't like it.
>
> Many __always_inline came because of clang's reluctance to inline
> small things, even if the resulting code size is bigger and slower.
>
> It is a bit unclear, this seems to happen when callers are 'big
> enough'. noinstr (callers) functions are also a problem.
>
> Let's take the list_add() call from dev_gro_receive() : clang does not
> inline it, for some reason.
>
> After adding __always_inline to list_add() and __list_add() we have
> smaller and more efficient code,
> for real workloads, not only benchmarks.

That falls into the '4-line function' category,
where s/inline/always_inline/ makes sense.

> list_add 2212 - -2212

How many copies of list_add() is that... clearly a few.
Generating a real function for a 'static inline' in a header is stupid.
Pretty much the intent for those is to get them inlined.
I'm sure there was a suggestion to make inline mean 'always inline',
except there are places where it would just be bloat.
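
For illustration only, a rough user-space sketch of the 'static inline in a
header' pattern being discussed. It is not the kernel's list.h: the names
sketch_always_inline, sketch_list_add() and friends are made up for this
example, and the attribute spelling assumes gcc or clang. The point is just
that the helper is small enough that forcing it inline should cost less
than emitting an out-of-line copy in every object file that uses it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __always_inline (gcc/clang only). */
#define sketch_always_inline inline __attribute__((__always_inline__))

struct list_head {
	struct list_head *next, *prev;
};

/* Make 'list' an empty circular list that points at itself. */
static sketch_always_inline void sketch_init(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Link 'entry' between two entries that are known to be adjacent. */
static sketch_always_inline void sketch_add_between(struct list_head *entry,
						    struct list_head *prev,
						    struct list_head *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/*
 * The '4-line function' case: with plain 'inline' the compiler is free
 * to emit a real function per translation unit; with the attribute the
 * body is always expanded at the call site.
 */
static sketch_always_inline void sketch_list_add(struct list_head *entry,
						 struct list_head *head)
{
	sketch_add_between(entry, head, head->next);
}

/* Returns 1 if two insertions at the head leave the list in LIFO order. */
static int sketch_demo(void)
{
	struct list_head head, a, b;

	sketch_init(&head);
	sketch_list_add(&a, &head);
	sketch_list_add(&b, &head);

	return head.next == &b && b.next == &a &&
	       a.next == &head && head.prev == &a;
}
```

Whether the forced inline is actually a win still needs checking per caller
(size output or bloat-o-meter, plus a real workload), which is the whole
argument above.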

David