[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALOAHbDLQhU5c+P_TiOQb9mjpgJkdfK21WemydVja+eyC0DkRA@mail.gmail.com>
Date: Tue, 31 Dec 2024 19:49:21 +0800
From: Yafang Shao <laoar.shao@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: davem@...emloft.net, dsahern@...nel.org, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] net: tcp: Define tcp_listendrop() as noinline
On Tue, Dec 31, 2024 at 5:26 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Tue, Dec 31, 2024 at 6:52 AM Yafang Shao <laoar.shao@...il.com> wrote:
> >
> > The LINUX_MIB_LISTENDROPS counter can be incremented for several reasons,
> > such as:
> > - A full SYN backlog queue
> > - SYN flood attacks
> > - Memory exhaustion
> > - Other resource constraints
> >
> > Recently, one of our services experienced frequent ListenDrops. While
> > attempting to trace the root cause, we discovered that tracing the function
> > tcp_listendrop() was not straightforward because it is currently inlined.
> > To overcome this, we had to create a livepatch that defined a non-inlined
> > version of the function, which we then traced using BPF programs.
> >
> > $ grep tcp_listendrop /proc/kallsyms
> > ffffffffc093fba0 t tcp_listendrop_x [livepatch_tmp]
> >
> > Through this process, we eventually determined that the ListenDrops were
> > caused by SYN attacks.
> >
> > Since tcp_listendrop() is not part of the critical path, defining it as
> > noinline would make it significantly easier to trace and diagnose issues
> > without requiring workarounds such as livepatching. This function can be
> > used by kernel modules like smc, so export it with EXPORT_SYMBOL_GPL().
> >
> > After that change, the result is as follows,
> >
> > $ grep tcp_listendrop /proc/kallsyms
> > ffffffffb718eaa0 T __pfx_tcp_listendrop
> > ffffffffb718eab0 T tcp_listendrop
> > ffffffffb7e636b8 r __ksymtab_tcp_listendrop
> >
> > Signed-off-by: Yafang Shao <laoar.shao@...il.com>
> > --
>
> Are we going to accept one patch at a time to un-inline all TCP
> related functions in the kernel ?
I don't have an in-depth understanding of the TCP/IP stack, so I can't
consolidate all related TCP functions in the error paths into a single
patch. I would greatly appreciate it if you could help ensure these
functions are marked as non-inlined based on your expertise. If you
don't have time to do it directly, could you provide some guidance?
The inlining of TCP functions in error paths often complicates tracing
efforts. For instance, we recently encountered issues with the inlined
tcp_write_err(), although we were fortunate to have an alternative
tracepoint available: inet_sk_error_report.
Unfortunately, no such alternative exists for tcp_listendrop(), which
is why I submitted this patch.
>
> My understanding is that current tools work fine. You may need to upgrade yours.
>
> # perf probe -a tcp_listendrop
> Added new events:
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
> probe:tcp_listendrop (on tcp_listendrop)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:tcp_listendrop -aR sleep 1
The issue is that we can't extract the struct `sock *sk` from
tcp_listendrop() using perf. Please correct me if I’m mistaken.
In a containerized environment, it's common to have many pods running
on a single host, each assigned a unique IP. Our goal is to filter for
a specific local_ip:local_port combination while also capturing the
corresponding remote IP. This task is straightforward with bpftrace or
other BPF-based tools, but I’m unsure how to accomplish it using perf.
--
Regards
Yafang
Powered by blists - more mailing lists