Message-ID: <ea50944f-8c64-4ceb-9d2c-70e4d9b38120@kernel.org>
Date: Thu, 22 May 2025 17:01:37 +0200
From: Matthieu Baerts <matttbe@...nel.org>
To: Kuniyuki Iwashima <kuniyu@...zon.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Willem de Bruijn <willemb@...gle.com>
Cc: Simon Horman <horms@...nel.org>, Kuniyuki Iwashima <kuni1840@...il.com>,
 netdev@...r.kernel.org
Subject: Re: [PATCH v1 net-next 2/6] socket: Rename sock_create_kern() to
 __sock_create_kern().

Hi Kuniyuki,

On 17/05/2025 05:50, Kuniyuki Iwashima wrote:
> sock_create_kern() is a catchy name and often chosen by non-networking
> developers to create kernel sockets.  But due to its poor documentation,
> it has caused a number of netns use-after-free bugs:
> 
>   * commit ef7134c7fc48 ("smb: client: Fix use-after-free of network
>      namespace.")
>   * commit b013b817f32f ("nvme-tcp: fix use-after-free of netns by
>      kernel TCP socket.")
>   .. and more in NFS, SMC, MPTCP, RDS
> 
> Some non-networking maintainers mentioned that the socket API should
> be more robust to prevent this type of issue. [0]
> 
> The current sock_create_kern() doesn't hold a reference to the netns,
> which allows the netns to be removed while the socket is still around.
> This is useful when the socket is used as the backend for a networking
> device.
> 
> But, this is rather a special case, where netdev folks should use a
> dedicated API, and we should provide sock_create_kern() as the standard
> API for general in-kernel use cases.
> 
> In fact, we did so before commit 26abe14379f8 ("net: Modify sk_alloc
> to not reference count the netns of kernel sockets."),
> 
>   sock_create_kern(&init_net, ..., &sock)
>   sk_change_net(sock->sk, net);
> 
> but that implicit API change ended up causing a lot of problems.
> 
> Let's rename sock_create_kern() to __sock_create_kern(), marking it as
> a special-purpose API, and add thorough documentation.
> 
> The next patch will add sock_create_kern() that holds netns refcnt.

Thank you for clarifying this!
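
To illustrate the lifetime problem described in the commit message, here
is a minimal userspace sketch. It is a toy model, not kernel code: every
name in it (toy_netns, toy_sock, toy_sock_create_noref(), and so on) is
hypothetical, and the "alive" flag merely stands in for memory that the
kernel would really have freed. It only assumes what the commit message
states: one creation path does not take a netns reference, and the
planned one does.

```c
/* Userspace toy model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_netns {
	int refcnt;	/* references that keep the namespace alive */
	bool alive;	/* false once the namespace has been torn down */
};

struct toy_sock {
	struct toy_netns *net;	/* namespace the socket belongs to */
	bool holds_ref;		/* did creation take a reference? */
};

/* Drop one reference; tear the namespace down when the last one goes. */
static void toy_put_net(struct toy_netns *net)
{
	assert(net->refcnt > 0);
	if (--net->refcnt == 0)
		net->alive = false;
}

/* Models the renamed helper: the socket does not pin the namespace. */
static void toy_sock_create_noref(struct toy_netns *net, struct toy_sock *sk)
{
	sk->net = net;
	sk->holds_ref = false;
}

/* Models the planned standard helper: the socket pins the namespace. */
static void toy_sock_create_ref(struct toy_netns *net, struct toy_sock *sk)
{
	net->refcnt++;
	sk->net = net;
	sk->holds_ref = true;
}

/* Release the socket, dropping its namespace reference if it took one. */
static void toy_sock_release(struct toy_sock *sk)
{
	if (sk->holds_ref)
		toy_put_net(sk->net);
	sk->net = NULL;
}

/* True if the socket's namespace is still valid to touch. */
static bool toy_sock_net_alive(const struct toy_sock *sk)
{
	return sk->net && sk->net->alive;
}
```

In this model, a socket made with toy_sock_create_noref() is left
pointing at a dead namespace as soon as the last outside reference is
dropped, which is the use-after-free pattern listed above. A socket made
with toy_sock_create_ref() keeps the namespace alive until
toy_sock_release() is called.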

(...)

> diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
> index d39e7c178460..a7467497de0f 100644
> --- a/net/mptcp/pm_kernel.c
> +++ b/net/mptcp/pm_kernel.c
> @@ -637,8 +637,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock *sk,
>  	int backlog = 1024;
>  	int err;
>  
> -	err = sock_create_kern(sock_net(sk), entry->addr.family,
> -			       SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk);
> +	err = __sock_create_kern(sock_net(sk), entry->addr.family,
> +				 SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk);
>  	if (err)
>  		return err;
>  
> diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
> index 15613d691bfe..602e689e991f 100644
> --- a/net/mptcp/subflow.c
> +++ b/net/mptcp/subflow.c
> @@ -1757,7 +1757,7 @@ int mptcp_subflow_create_socket(struct sock *sk, unsigned short family,
>  	if (unlikely(!sk->sk_socket))
>  		return -EINVAL;
>  
> -	err = sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP, &sf);
> +	err = __sock_create_kern(net, family, SOCK_STREAM, IPPROTO_TCP, &sf);
>  	if (err)
>  		return err;
>  
> @@ -1948,7 +1948,7 @@ static int subflow_ulp_init(struct sock *sk)
>  	int err = 0;
>  
>  	/* disallow attaching ULP to a socket unless it has been
> -	 * created with sock_create_kern()
> +	 * created with __sock_create_kern()
>  	 */
>  	if (!sk->sk_kern_sock) {
>  		err = -EOPNOTSUPP;

For the changes in MPTCP:

Acked-by: Matthieu Baerts (NGI0) <matttbe@...nel.org>  # net/mptcp

(...)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

