Message-ID: <CAL+tcoCsAJ+pOs6M2E2xRMa=VKNB9wwoR2DEbiFyvOmfe_dkGQ@mail.gmail.com>
Date: Tue, 27 Jan 2026 09:56:12 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Paolo Abeni <pabeni@...hat.com>, "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, Simon Horman <horms@...nel.org>,
Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org, eric.dumazet@...il.com,
Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [PATCH net-next] tcp: move sk_forced_mem_schedule() to tcp.c

Hi Eric,

On Mon, Jan 26, 2026 at 4:45 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Mon, Jan 26, 2026 at 9:18 AM Paolo Abeni <pabeni@...hat.com> wrote:
> >
> > Hi Eric,
> >
> > On Fri, Jan 23, 2026 at 6:16 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > TCP fast path can (auto)inline this helper, instead
> > > of (auto)inlining it from tcp_send_fin().
> > >
> > > No change of overall code size, but tcp_sendmsg() is faster.
> > >
> > > $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> > > add/remove: 0/0 grow/shrink: 1/1 up/down: 141/-140 (1)
> > > Function                     old     new   delta
> > > tcp_stream_alloc_skb         216     357    +141
> > > tcp_send_fin                 688     548    -140
> > > Total: Before=22236729, After=22236730, chg +0.00%
> > >
> > > BTW, we might change tcp_send_fin() to use tcp_stream_alloc_skb().
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> >
> > I've a question out of sheer curiosity: are you using some specific tool
> > to look for inline opportunity, or "just" careful code and/or objdump
> > analysis?
Paolo, right, you beat me to it! I'm also curious :)
>
> I am studying performance profiles on a stress test using Google
> production kernels, on platforms that are a big chunk of the fleet.
> These kernels are very close to upstream (at least for core and TCP
> networking stacks), and use clang and FDO.
>
> I am currently focusing on some functions that even FDO does not
> inline, for some reason.

Eric, could you share more of your experience and the methodology
behind this?

Before your recent work, I had completely missed that there was
anything to gain from inlining decisions. perf normally doesn't give
us a fine-grained view of these functions, and in particular no hint
about whether a given function should be inlined. If there is a tool
for this, it would be super awesome!

Let me guess: are you trying to avoid function calls as much as you
can in the hot path, based on your deep understanding of
ingress/egress performance?

Thanks,
Jason