Message-ID: <YxX9NjhQlppDUMkE@gmail.com>
Date: Mon, 5 Sep 2022 06:44:22 -0700
From: Breno Leitao <leitao@...ian.org>
To: Eric Dumazet <edumazet@...gle.com>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
netdev <netdev@...r.kernel.org>, leit@...com,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Paolo Abeni <pabeni@...hat.com>,
David Ahern <dsahern@...nel.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RESEND net-next] tcp: socket-specific version of
WARN_ON_ONCE()
Hello Eric,
On Sat, Sep 03, 2022 at 09:42:43AM -0700, Eric Dumazet wrote:
> On Wed, Aug 31, 2022 at 6:38 AM Breno Leitao <leitao@...ian.org> wrote:
> >
> > There are cases where we need information about the socket during a
> > warning, so that it can help us find bugs that do not have an easy
> > repro.
> >
> > BPF congestion control algorithms can change socket state in unexpected
> > ways, leading to WARNings. Additional information about the socket state
> > is useful to identify the culprit.
A little more context here. We hit this warning in production
several hundred times a day. We don't know exactly where it is coming
from, which is why this patch is being proposed.
> Well, this suggests we need to fix BPF side ?
This patch might help us identify the culprit that is setting the wrong
value in the congestion window. If the problem is on the BPF side, then
we need to fix the BPF side, for sure.
> It seems you already found the issue in an eBPF CC, can you share the details ?
Not really. I've applied this patch to our internal kernel, and we
might soon have more information about what is causing this warning.
> > This diff creates a TCP socket-specific version of WARN_ON_ONCE(), and
> > attaches it to tcp_snd_cwnd_set().
>
> Well, I feel this will need constant additions... the state of a
> custom BPF CC is opaque to core TCP stack anyway ?
>
> >
> > Signed-off-by: Breno Leitao <leitao@...ian.org>
> > ---
> > include/net/tcp.h | 3 ++-
> > include/net/tcp_debug.h | 10 ++++++++++
> > net/ipv4/tcp.c | 30 ++++++++++++++++++++++++++++++
> > 3 files changed, 42 insertions(+), 1 deletion(-)
> > create mode 100644 include/net/tcp_debug.h
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index d10962b9f0d0..73c3970d8839 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -40,6 +40,7 @@
> > #include <net/inet_ecn.h>
> > #include <net/dst.h>
> > #include <net/mptcp.h>
> > +#include <net/tcp_debug.h>
> >
> > #include <linux/seq_file.h>
> > #include <linux/memcontrol.h>
> > @@ -1222,7 +1223,7 @@ static inline u32 tcp_snd_cwnd(const struct tcp_sock *tp)
> >
> > static inline void tcp_snd_cwnd_set(struct tcp_sock *tp, u32 val)
> > {
> > - WARN_ON_ONCE((int)val <= 0);
> > + TCP_SOCK_WARN_ON_ONCE(tp, (int)val <= 0);
> > tp->snd_cwnd = val;
> > }
> >
> > diff --git a/include/net/tcp_debug.h b/include/net/tcp_debug.h
> > new file mode 100644
> > index 000000000000..50e96d87d335
> > --- /dev/null
> > +++ b/include/net/tcp_debug.h
> > @@ -0,0 +1,10 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _LINUX_TCP_DEBUG_H
> > +#define _LINUX_TCP_DEBUG_H
> > +
> > +void tcp_sock_warn(const struct tcp_sock *tp);
> > +
> > +#define TCP_SOCK_WARN_ON_ONCE(tcp_sock, condition) \
> > + DO_ONCE_LITE_IF(condition, tcp_sock_warn, tcp_sock)
> > +
> > +#endif /* _LINUX_TCP_DEBUG_H */
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index bbe218753662..71771fee72f7 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -4684,6 +4684,36 @@ int tcp_abort(struct sock *sk, int err)
> > }
> > EXPORT_SYMBOL_GPL(tcp_abort);
> >
> > +void tcp_sock_warn(const struct tcp_sock *tp)
> > +{
> > +	const struct sock *sk = (const struct sock *)tp;
> > +	struct inet_sock *inet = inet_sk(sk);
> > +	struct inet_connection_sock *icsk = inet_csk(sk);
> > +
> > +	WARN_ON(1);
> > +
> > +	if (!tp)
> > +		return;
> > +
> > +	pr_warn("Socket Info: family=%u state=%d sport=%u dport=%u ccname=%s cwnd=%u",
> > +		sk->sk_family, sk->sk_state, ntohs(inet->inet_sport),
> > +		ntohs(inet->inet_dport), icsk->icsk_ca_ops->name, tcp_snd_cwnd(tp));
> > +
> > +	switch (sk->sk_family) {
> > +	case AF_INET:
> > +		pr_warn("saddr=%pI4 daddr=%pI4", &inet->inet_saddr,
> > +			&inet->inet_daddr);
> > +		break;
> > +#if IS_ENABLED(CONFIG_IPV6)
> > +	case AF_INET6:
> > +		pr_warn("saddr=%pI6 daddr=%pI6", &sk->sk_v6_rcv_saddr,
> > +			&sk->sk_v6_daddr);
> > +		break;
> > +#endif
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(tcp_sock_warn);
> > +
> > extern struct tcp_congestion_ops tcp_reno;
> >
> > static __initdata unsigned long thash_entries;
> > --
> > 2.30.2
> >
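For anyone reading the thread without a kernel tree at hand, here is a
rough user-space model of the pattern in the patch: the condition is
checked on every call, and the socket details are reported only the
first time it is true. All names here (struct fake_sock,
fake_sock_warn(), FAKE_SOCK_WARN_ON_ONCE(), fake_cwnd_set(), warn_calls)
are hypothetical stand-ins, not kernel symbols, and the once-flag is a
plain static bool with no thread safety.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the few socket fields the patch prints. */
struct fake_sock {
	unsigned int sport;
	unsigned int dport;
	unsigned int cwnd;
	const char *ccname;
};

/* Counts how many reports were actually emitted. */
static int warn_calls;

/* Prints the socket details, similar in spirit to tcp_sock_warn(). */
static void fake_sock_warn(const struct fake_sock *sk)
{
	warn_calls++;
	fprintf(stderr, "Socket Info: sport=%u dport=%u ccname=%s cwnd=%u\n",
		sk->sport, sk->dport, sk->ccname, sk->cwnd);
}

/*
 * Report only on the first hit at this call site.
 * Each expansion of the macro gets its own static flag.
 */
#define FAKE_SOCK_WARN_ON_ONCE(sk, cond)		\
	do {						\
		static bool warned_;			\
		if ((cond) && !warned_) {		\
			warned_ = true;			\
			fake_sock_warn(sk);		\
		}					\
	} while (0)

/* Mirrors the shape of tcp_snd_cwnd_set(): check first, then assign. */
static void fake_cwnd_set(struct fake_sock *sk, unsigned int val)
{
	FAKE_SOCK_WARN_ON_ONCE(sk, (int)val <= 0);
	sk->cwnd = val;
}
```

Note that in this model, as in the quoted hunk, the report runs before
the assignment, so the cwnd it prints is the previous value and not the
offending one.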