Message-ID: <ea50944f-8c64-4ceb-9d2c-70e4d9b38120@kernel.org>
Date: Thu, 22 May 2025 17:01:37 +0200
From: Matthieu Baerts <matttbe@...nel.org>
To: Kuniyuki Iwashima <kuniyu@...zon.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Willem de Bruijn <willemb@...gle.com>
Cc: Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuni1840@...il.com>,
netdev@...r.kernel.org
Subject: Re: [PATCH v1 net-next 2/6] socket: Rename sock_create_kern() to
__sock_create_kern().

Hi Kuniyuki,

On 17/05/2025 05:50, Kuniyuki Iwashima wrote:
> sock_create_kern() is a catchy name and often chosen by non-networking
> developers to create kernel sockets. But due to its poor documentation,
> it has caused a bunch of netns use-after-free:
>
> * commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
> namespace.")
> * commit b013b817f32f ("nvme-tcp: fix use-after-free of netns by
> kernel TCP socket.")
> .. and more in NFS, SMC, MPTCP, RDS
>
> Some non-networking maintainers mentioned that the socket API should
> be more robust to prevent this type of issue. [0]
>
> The current sock_create_kern() doesn't hold a reference to the netns,
> which allows the netns to be removed while the socket is still around.
> This is useful when the socket is used as the backend for a networking
> device.
>
> But, this is rather a special case, where netdev folks should use a
> dedicated API, and we should provide sock_create_kern() as the standard
> API for general in-kernel use cases.
>
> In fact, we did so before commit 26abe14379f8 ("net: Modify sk_alloc
> to not reference count the netns of kernel sockets."),
>
> sock_create_kern(&init_net, ..., &sock)
> sk_change_net(sock->sk, net);
>
> but that implicit API change ended up causing a lot of problems.
>
> Let's rename sock_create_kern() to __sock_create_kern() as a special
> API and add detailed documentation.
>
> The next patch will add sock_create_kern() that holds netns refcnt.
Thank you for clarifying this!
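
For readers less familiar with the lifetime issue described in the quoted commit message, here is a rough user-space sketch of the difference between a kernel socket that holds a netns reference and one that does not. This is only a model under stated assumptions: every name in it (model_net, model_sock, model_sock_create(), and so on) is hypothetical and is not a kernel API, and the real netns teardown is far more involved than flipping a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_net {
	int refcnt;	/* references held by users of the namespace */
	bool alive;	/* false once the namespace has been dismantled */
};

struct model_sock {
	struct model_net *net;
	bool holds_net_ref;	/* true: refcounted socket, false: "raw" kernel socket */
};

/* Drop one reference; the namespace is dismantled when the count hits zero. */
static void model_net_put(struct model_net *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		net->alive = false;	/* the real kernel would free the netns here */
}

/* Create a socket in @net, optionally taking a reference on the namespace. */
static struct model_sock model_sock_create(struct model_net *net, bool hold_net)
{
	struct model_sock sk = { .net = net, .holds_net_ref = hold_net };

	if (hold_net)
		net->refcnt++;
	return sk;
}

/* Release the socket and, if it held one, its namespace reference. */
static void model_sock_release(struct model_sock *sk)
{
	if (sk->holds_net_ref)
		model_net_put(sk->net);
	sk->net = NULL;
}

/*
 * Returns true if the namespace is still alive after its last external
 * user goes away while the socket is still around.
 */
static bool model_net_survives_owner_exit(bool hold_net)
{
	struct model_net net = { .refcnt = 1, .alive = true };
	struct model_sock sk = model_sock_create(&net, hold_net);
	bool survived;

	model_net_put(&net);	/* the last process leaves the namespace */
	survived = net.alive;	/* a dead netns here is the use-after-free window */
	model_sock_release(&sk);
	return survived;
}
```

In this model, model_net_survives_owner_exit(false) corresponds to the non-refcounted behaviour being renamed to __sock_create_kern(), where the namespace can go away under a live socket, and model_net_survives_owner_exit(true) corresponds to the refcounted variant the next patch is described as adding.
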
(...)
> diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
> index d39e7c178460..a7467497de0f 100644
> --- a/net/mptcp/pm_kernel.c
> +++ b/net/mptcp/pm_kernel.c
> @@ -637,8 +637,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk,
> int backlog = 1024;
> int err;
>
> - err = sock_create_kern(sock_net(sk), entry->addr.family,
> - SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk);
> + err = __sock_create_kern(sock_net(sk), entry->addr.family,
> + SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk);
> if (err)
> return err;
>
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 15613d691bfe..602e689e991f 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1757,7 +1757,7 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family,
> if (unlikely(!sk->sk_socket))
> return -EINVAL;
>
> - err = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP, &sf);
> + err = __sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP, &sf);
> if (err)
> return err;
>
> @@ -1948,7 +1948,7 @@ static int subflow_ulp_init(struct sock *sk)
> int err = 0;
>
> /* disallow attaching ULP to a socket unless it has been
> - * created with sock_create_kern()
> + * created with __sock_create_kern()
> */
> if (!sk->sk_kern_sock) {
> err = -EOPNOTSUPP;
For the changes in MPTCP:
Acked-by: Matthieu Baerts (NGI0) <matttbe@...nel.org> # net/mptcp
(...)
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.