Message-ID: <b722c37528e6f94bef828d6ca478a9fa8d33501a@linux.dev>
Date: Thu, 23 Oct 2025 14:38:48 +0000
From: "Jiayuan Chen" <jiayuan.chen@...ux.dev>
To: "Matthieu Baerts" <matttbe@...nel.org>, mptcp@...ts.linux.dev
Cc: stable@...r.kernel.org, "Jakub Sitnicki" <jakub@...udflare.com>, "John
Fastabend" <john.fastabend@...il.com>, "Eric Dumazet"
<edumazet@...gle.com>, "Kuniyuki Iwashima" <kuniyu@...gle.com>, "Paolo
Abeni" <pabeni@...hat.com>, "Willem de Bruijn" <willemb@...gle.com>,
"David S. Miller" <davem@...emloft.net>, "Jakub Kicinski"
<kuba@...nel.org>, "Simon Horman" <horms@...nel.org>, "Mat Martineau"
<martineau@...nel.org>, "Geliang Tang" <geliang@...nel.org>, "Andrii
Nakryiko" <andrii@...nel.org>, "Eduard Zingerman" <eddyz87@...il.com>,
"Alexei Starovoitov" <ast@...nel.org>, "Daniel Borkmann"
<daniel@...earbox.net>, "Martin KaFai Lau" <martin.lau@...ux.dev>, "Song
Liu" <song@...nel.org>, "Yonghong Song" <yonghong.song@...ux.dev>, "KP
Singh" <kpsingh@...nel.org>, "Stanislav Fomichev" <sdf@...ichev.me>, "Hao
Luo" <haoluo@...gle.com>, "Jiri Olsa" <jolsa@...nel.org>, "Shuah Khan"
<shuah@...nel.org>, "Florian Westphal" <fw@...len.de>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
bpf@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH net v3 1/3] net,mptcp: fix proto fallback detection with
BPF sockmap
October 23, 2025 at 22:10, "Matthieu Baerts" <matttbe@...nel.org> wrote:
>
> Hi Jiayuan,
>
> On 23/10/2025 14:54, Jiayuan Chen wrote:
>
> >
> > When the server has MPTCP enabled but receives a non-MP-capable request
> > from a client, it calls mptcp_fallback_tcp_ops().
> >
> > Since non-MPTCP connections are allowed to use sockmap, which replaces
> > sk->sk_prot, using sk->sk_prot to determine the IP version in
> > mptcp_fallback_tcp_ops() becomes unreliable. This can lead to assigning
> > incorrect ops to sk->sk_socket->ops.
> >
> > Additionally, when BPF Sockmap modifies the protocol handlers, the
> > original WARN_ON_ONCE(sk->sk_prot != &tcp_prot) check would falsely
> > trigger warnings.
> >
> > Fix this by using the more stable sk_family to distinguish between IPv4
> > and IPv6 connections, ensuring correct fallback protocol operations are
> > selected even when BPF Sockmap has modified the socket protocol handlers.
> >
> > Fixes: 0b4f33def7bb ("mptcp: fix tcp fallback crash")
> > Cc: <stable@...r.kernel.org>
> > Signed-off-by: Jiayuan Chen <jiayuan.chen@...ux.dev>
> > Reviewed-by: Jakub Sitnicki <jakub@...udflare.com>
> > ---
> > net/mptcp/protocol.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 0292162a14ee..2393741bc310 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -61,11 +61,16 @@ static u64 mptcp_wnd_end(const struct mptcp_sock *msk)
> >
> > static const struct proto_ops *mptcp_fallback_tcp_ops(const struct sock *sk)
> > {
> > + /* When BPF sockmap is used, it may replace sk->sk_prot.
> > + * Using sk_family is a reliable way to determine the IP version.
> > + */
> > + unsigned short family = READ_ONCE(sk->sk_family);
> > +
> > #if IS_ENABLED(CONFIG_MPTCP_IPV6)
> > - if (sk->sk_prot == &tcpv6_prot)
> > + if (family == AF_INET6)
> > return &inet6_stream_ops;
> > #endif
> > - WARN_ON_ONCE(sk->sk_prot != &tcp_prot);
> > + WARN_ON_ONCE(family != AF_INET);
> > return &inet_stream_ops;
> >
> Just to be sure: is there anything in BPF modifying sk->sk_socket->ops?
> Because that's what mptcp_fallback_tcp_ops() will do somehow.
>
> In other words, is it always fine to set inet(6)_stream_ops? (I guess
> yes, but better to be sure while we are looking at that :) )

Hi Matt,

I can confirm that on the BPF side, the only special operations that
currently target sockets are sockmap/sockhash. Their implementations do not
modify sk->sk_socket->ops; they only replace sk->sk_prot, because the BPF
side typically operates on 'struct sock' and does not concern itself with
'struct socket'.

Therefore, setting inet(6)_stream_ops is fine.
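
For illustration only, the difference between the two checks can be modelled
in userspace. This is a rough sketch, not kernel code: every type, constant
and function name below (model_sock, MODEL_AF_INET6, model_sockmap_prot,
fallback_ops_by_prot, fallback_ops_by_family) is a hypothetical stand-in,
and the "sockmap" proto is simply a different pointer that plays the role
of the replaced sk->sk_prot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct proto { const char *name; };
struct proto_ops { const char *name; };

/* Model constants, deliberately not the real AF_* values from the headers. */
enum { MODEL_AF_INET = 1, MODEL_AF_INET6 = 2 };

static const struct proto model_tcp_prot = { "tcp" };
static const struct proto model_tcpv6_prot = { "tcpv6" };
/* Plays the role of the proto installed when a socket joins a sockmap. */
static const struct proto model_sockmap_prot = { "sockmap-tcpv6" };

static const struct proto_ops model_inet_stream_ops = { "inet_stream_ops" };
static const struct proto_ops model_inet6_stream_ops = { "inet6_stream_ops" };

struct model_sock {
	unsigned short family;     /* set at creation, untouched by sockmap */
	const struct proto *prot;  /* may be swapped by sockmap */
};

/* Old logic: infer the IP version from the proto pointer. */
static const struct proto_ops *
fallback_ops_by_prot(const struct model_sock *sk)
{
	if (sk->prot == &model_tcpv6_prot)
		return &model_inet6_stream_ops;
	return &model_inet_stream_ops;
}

/* New logic: infer the IP version from the address family. */
static const struct proto_ops *
fallback_ops_by_family(const struct model_sock *sk)
{
	if (sk->family == MODEL_AF_INET6)
		return &model_inet6_stream_ops;
	return &model_inet_stream_ops;
}
```

In this model, an IPv6 socket whose prot has been switched from
model_tcpv6_prot to model_sockmap_prot gets model_inet_stream_ops from the
pointer-based lookup, while the family-based lookup still returns
model_inet6_stream_ops.
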
Thanks,
Jiayuan
> >
> > }
> >
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>