Message-ID: <aXDYfYy3f1NQm5A0@sgarzare-redhat>
Date: Wed, 21 Jan 2026 15:48:13 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Bobby Eshleman <bobbyeshleman@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Stefan Hajnoczi <stefanha@...hat.com>, "Michael S. Tsirkin" <mst@...hat.com>, 
	Jason Wang <jasowang@...hat.com>, Eugenio Pérez <eperezma@...hat.com>, 
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, "K. Y. Srinivasan" <kys@...rosoft.com>, 
	Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>, 
	Bryan Tan <bryan-bt.tan@...adcom.com>, Vishnu Dasa <vishnu.dasa@...adcom.com>, 
	Broadcom internal kernel review list <bcm-kernel-feedback-list@...adcom.com>, Shuah Khan <shuah@...nel.org>, Long Li <longli@...rosoft.com>, 
	Jonathan Corbet <corbet@....net>, linux-kernel@...r.kernel.org, virtualization@...ts.linux.dev, 
	netdev@...r.kernel.org, kvm@...r.kernel.org, linux-hyperv@...r.kernel.org, 
	linux-kselftest@...r.kernel.org, berrange@...hat.com, Sargun Dhillon <sargun@...gun.me>, 
	linux-doc@...r.kernel.org, Bobby Eshleman <bobbyeshleman@...a.com>
Subject: Re: [PATCH net-next v15 01/12] vsock: add netns to vsock core

On Fri, Jan 16, 2026 at 01:28:41PM -0800, Bobby Eshleman wrote:
>From: Bobby Eshleman <bobbyeshleman@...a.com>
>
>Add netns logic to vsock core. Additionally, modify transport hook
>prototypes to be used by later transport-specific patches (e.g.,
>*_seqpacket_allow()).
>
>Namespaces are supported primarily by changing socket lookup functions
>(e.g., vsock_find_connected_socket()) to take into account the socket
>namespace and the namespace mode before considering a candidate socket a
>"match".
>
>This patch also introduces the sysctl /proc/sys/net/vsock/ns_mode to
>report the mode and /proc/sys/net/vsock/child_ns_mode to set the mode
>for new namespaces.
>
>Add netns functionality (initialization, passing to transports, procfs,
>etc...) to the af_vsock socket layer. Later patches that add netns
>support to transports depend on this patch.

nit: maybe we should mention here why we changed the random port 
allocation

(not a big deal, only if you need to resend)

>
>dgram_allow(), stream_allow(), and seqpacket_allow() callbacks are
>modified to take a vsk in order to perform logic on namespace modes. In
>future patches, the net will also be used for socket
>lookups in these functions.
>
>Signed-off-by: Bobby Eshleman <bobbyeshleman@...a.com>
>---
>Changes in v15:
>- make static port in __vsock_bind_connectible per-netns
>- remove __net_initdata because we want the ops beyond just boot
>- add vsock_init_ns_mode kernel cmdline parameter to set init ns mode
>- use if (ret || !write) in __vsock_net_mode_string() (Stefano)
>- add vsock_net_mode_global() (Stefano)
>- hide !net == VSOCK_NET_MODE_GLOBAL inside vsock_net_mode() (Stefano)
>- clarify af_vsock.c comments on ns_mode/child_ns_mode (Stefano)
>
>Changes in v14:
>- include linux/sysctl.h in af_vsock.c
>- squash patch 'vsock: add per-net vsock NS mode state' into this patch
>  (prior version can be found here):
>  https://lore.kernel.org/all/20251223-vsock-vmtest-v13-1-9d6db8e7c80b@meta.com/)
>
>Changes in v13:
>- remove net_mode and replace with direct accesses to net->vsock.mode,
>  since this is now immutable.
>- update comments about mode behavior and mutability, and sysctl API
>- only pass NULL for net when wanting global, instead of net_mode ==
>  VSOCK_NET_MODE_GLOBAL. This reflects the new logic
>  of vsock_net_check_mode() that only requires net pointers (not
>  net_mode).
>- refactor sysctl string code into a re-usable function, because
>  child_ns_mode and ns_mode both handle the same strings.
>- remove redundant vsock_net_init(&init_net) call in module init because
>  pernet registration calls the callback on the init_net too
>
>Changes in v12:
>- return true in dgram_allow(), stream_allow(), and seqpacket_allow()
>  only if net_mode == VSOCK_NET_MODE_GLOBAL (Stefano)
>- document bind(VMADDR_CID_ANY) case in af_vsock.c (Stefano)
>- change order of stream_allow() call in vmci so we can pass vsk
>  to it
>
>Changes in v10:
>- add file-level comment about what happens to sockets/devices
>  when the namespace mode changes (Stefano)
>- change the 'if (write)' boolean in vsock_net_mode_string() to
>  if (!write), this simplifies a later patch which adds "goto"
>  for mutex unlocking on function exit.
>
>Changes in v9:
>- remove virtio_vsock_alloc_rx_skb() (Stefano)
>- remove vsock_global_dummy_net, not needed as net=NULL +
>  net_mode=VSOCK_NET_MODE_GLOBAL achieves identical result
>
>Changes in v7:
>- hv_sock: fix hyperv build error
>- explain why vhost does not use the dummy
>- explain usage of __vsock_global_dummy_net
>- explain why VSOCK_NET_MODE_STR_MAX is 8 characters
>- use switch-case in vsock_net_mode_string()
>- avoid changing transports as much as possible
>- add vsock_find_{bound,connected}_socket_net()
>- rename `vsock_hdr` to `sysctl_hdr`
>- add virtio_vsock_alloc_linear_skb() wrapper for setting dummy net and
>  global mode for virtio-vsock, move skb->cb zero-ing into wrapper
>- explain seqpacket_allow() change
>- move net setting to __vsock_create() instead of vsock_create() so
>  that child sockets also have their net assigned upon accept()
>
>Changes in v6:
>- unregister sysctl ops in vsock_exit()
>- af_vsock: clarify description of CID behavior
>- af_vsock: fix buf vs buffer naming, and length checking
>- af_vsock: fix length checking w/ correct ctl_table->maxlen
>
>Changes in v5:
>- vsock_global_net() -> vsock_global_dummy_net()
>- update comments for new uAPI
>- use /proc/sys/net/vsock/ns_mode instead of /proc/net/vsock_ns_mode
>- add prototype changes so patch remains compilable
>---
> Documentation/admin-guide/kernel-parameters.txt |  14 +
> MAINTAINERS                                     |   1 +
> drivers/vhost/vsock.c                           |   6 +-
> include/linux/virtio_vsock.h                    |   4 +-
> include/net/af_vsock.h                          |  61 ++++-
> include/net/net_namespace.h                     |   4 +
> include/net/netns/vsock.h                       |  21 ++
> net/vmw_vsock/af_vsock.c                        | 328 ++++++++++++++++++++++--
> net/vmw_vsock/hyperv_transport.c                |   7 +-
> net/vmw_vsock/virtio_transport.c                |   9 +-
> net/vmw_vsock/virtio_transport_common.c         |   6 +-
> net/vmw_vsock/vmci_transport.c                  |  26 +-
> net/vmw_vsock/vsock_loopback.c                  |   8 +-
> 13 files changed, 444 insertions(+), 51 deletions(-)
>
>diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>index a8d0afde7f85..b6e3bfe365a1 100644
>--- a/Documentation/admin-guide/kernel-parameters.txt
>+++ b/Documentation/admin-guide/kernel-parameters.txt
>@@ -8253,6 +8253,20 @@ Kernel parameters
> 			            them quite hard to use for exploits but
> 			            might break your system.
>
>+	vsock_init_ns_mode=
>+			[KNL,NET] Set the vsock namespace mode for the init
>+			(root) network namespace.
>+
>+			global      [default] The init namespace operates in
>+			            global mode where CIDs are system-wide and
>+			            sockets can communicate across global
>+			            namespaces.
>+
>+			local       The init namespace operates in local mode
>+			            where CIDs are private to the namespace and
>+			            sockets can only communicate within the same
>+			            namespace.
>+

My comment on v14 was meant more to start a discussion :-) sorry for not 
being clear.

I briefly discussed it with Paolo in chat to better understand our 
policy between cmdline parameters and module parameters, and it seems 
that both are discouraged.

So he asked me if we have a use case for this, and thinking about it, I 
don't have one at the moment. Also, if a user decides to set all netns 
to local, whether init_net is local or global doesn't really matter, 
right?

So perhaps before adding this, we should have a real use case.
Perhaps more than this feature, I would add a way to change the default 
of all netns (including init_net) from global to local. But we can do 
that later, since all netns have a way to understand what mode they are 
in, so we don't break anything and the user has to explicitly change it, 
knowing that they are breaking compatibility with pre-netns support.


That said, at this point, maybe we can remove this, documenting that 
init_net is always global, and if we have a use case in the future, we 
can add this (or something else) to set the init_net mode (or change the 
default for all netns).
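
To make the "init_net is always global" option concrete, here is a 
minimal userspace sketch (not kernel code). It assumes that a new 
namespace takes its mode from the creating namespace's child_ns_mode, 
that the mode never changes afterwards, and that a new namespace's own 
child_ns_mode starts out equal to its mode. All names in it 
(netns_model, netns_model_init, NS_MODEL_*) are hypothetical and do not 
appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two vsock namespace modes. */
enum ns_model_mode { NS_MODEL_GLOBAL, NS_MODEL_LOCAL };

struct netns_model {
	enum ns_model_mode mode;       /* models ns_mode: set once at creation */
	enum ns_model_mode child_mode; /* models child_ns_mode: writable */
};

/*
 * Initialize a namespace model.
 * creator == NULL models init_net, which is always global here.
 * Otherwise the mode is copied from the creator's child_mode
 * (an assumption about how child_ns_mode is applied).
 */
static void netns_model_init(struct netns_model *ns,
			     const struct netns_model *creator)
{
	assert(ns != NULL);

	ns->mode = creator ? creator->child_mode : NS_MODEL_GLOBAL;
	/* Assumption: a new namespace defaults child_mode to its own mode. */
	ns->child_mode = ns->mode;
}
```

With this model, writing NS_MODEL_LOCAL to init_net's child_mode only 
affects namespaces created afterwards; init_net itself and any existing 
namespace keep the mode they were created with.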

Let's wait a bit before the next version for a comment from Paolo or 
Jakub on this. But I'm almost fine either way, so:

Reviewed-by: Stefano Garzarella <sgarzare@...hat.com>

> 	vt.color=	[VT] Default text color.
> 			Format: 0xYX, X = foreground, Y = background.
> 			Default: 0x07 = light gray on black.

[...]

>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index a3505a4dcee0..3fc8160d51df 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c

[...]

>@@ -235,33 +303,42 @@ static void __vsock_remove_connected(struct vsock_sock *vsk)
> 	sock_put(&vsk->sk);
> }
>

In v14 I suggested adding some documentation above the vsock_find*() 
and vsock_find_*_net() functions to better explain which one transports 
should use.

Again, it's not a big deal; we can fix it later if you don't need to 
resend.
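
As a rough illustration of the difference such documentation would 
describe, here is a minimal userspace model (not kernel code) of the 
bound-socket lookup. It assumes, based on the changelog, that a NULL net 
is treated as global mode and that two namespaces "match" when they are 
the same namespace or both are in global mode. The real 
vsock_net_check_mode() is not shown in this excerpt, so that rule is an 
assumption, and every model_* name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CID_ANY 0xffffffffu /* stands in for VMADDR_CID_ANY */

enum model_mode { MODEL_GLOBAL, MODEL_LOCAL };

struct model_net {
	enum model_mode mode;
};

struct model_sock {
	unsigned int cid;
	unsigned int port;
	const struct model_net *net; /* namespace the socket lives in */
};

/* Assumption: a NULL net means "global", as the changelog describes. */
static enum model_mode model_net_mode(const struct model_net *net)
{
	return net ? net->mode : MODEL_GLOBAL;
}

/* Assumed rule: same namespace, or both namespaces in global mode. */
static bool model_check_mode(const struct model_net *a,
			     const struct model_net *b)
{
	if (a == b)
		return true;

	return model_net_mode(a) == MODEL_GLOBAL &&
	       model_net_mode(b) == MODEL_GLOBAL;
}

/* Namespace-aware lookup, modeled on the *_net() variant. */
static const struct model_sock *
model_find_bound_net(const struct model_sock *tbl, size_t n,
		     unsigned int cid, unsigned int port,
		     const struct model_net *net)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct model_sock *sk = &tbl[i];
		/* Exact address match, or same port with a wildcard CID. */
		bool addr_match = sk->port == port &&
				  (sk->cid == cid ||
				   sk->cid == MODEL_CID_ANY ||
				   cid == MODEL_CID_ANY);

		if (addr_match && model_check_mode(sk->net, net))
			return sk;
	}

	return NULL;
}

/* Legacy lookup: passes NULL, so it only sees global-mode sockets. */
static const struct model_sock *
model_find_bound(const struct model_sock *tbl, size_t n,
		 unsigned int cid, unsigned int port)
{
	return model_find_bound_net(tbl, n, cid, port, NULL);
}
```

Under these assumptions, a socket bound in a local-mode namespace is 
found only when the caller passes that same namespace to 
model_find_bound_net(); model_find_bound() never returns it. That is 
the kind of distinction a comment above the real functions could spell 
out for transport authors.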

Thanks,
Stefano

>-static struct sock *__vsock_find_bound_socket(struct sockaddr_vm *addr)
>+static struct sock *__vsock_find_bound_socket_net(struct sockaddr_vm *addr,
>+						  struct net *net)
> {
> 	struct vsock_sock *vsk;
>
> 	list_for_each_entry(vsk, vsock_bound_sockets(addr), bound_table) {
>-		if (vsock_addr_equals_addr(addr, &vsk->local_addr))
>-			return sk_vsock(vsk);
>+		struct sock *sk = sk_vsock(vsk);
>+
>+		if (vsock_addr_equals_addr(addr, &vsk->local_addr) &&
>+		    vsock_net_check_mode(sock_net(sk), net))
>+			return sk;
>
> 		if (addr->svm_port == vsk->local_addr.svm_port &&
> 		    (vsk->local_addr.svm_cid == VMADDR_CID_ANY ||
>-		     addr->svm_cid == VMADDR_CID_ANY))
>-			return sk_vsock(vsk);
>+		     addr->svm_cid == VMADDR_CID_ANY) &&
>+		     vsock_net_check_mode(sock_net(sk), net))
>+			return sk;
> 	}
>
> 	return NULL;
> }
>
>-static struct sock *__vsock_find_connected_socket(struct sockaddr_vm *src,
>-						  struct sockaddr_vm *dst)
>+static struct sock *
>+__vsock_find_connected_socket_net(struct sockaddr_vm *src,
>+				  struct sockaddr_vm *dst, struct net *net)
> {
> 	struct vsock_sock *vsk;
>
> 	list_for_each_entry(vsk, vsock_connected_sockets(src, dst),
> 			    connected_table) {
>+		struct sock *sk = sk_vsock(vsk);
>+
> 		if (vsock_addr_equals_addr(src, &vsk->remote_addr) &&
>-		    dst->svm_port == vsk->local_addr.svm_port) {
>-			return sk_vsock(vsk);
>+		    dst->svm_port == vsk->local_addr.svm_port &&
>+		    vsock_net_check_mode(sock_net(sk), net)) {
>+			return sk;
> 		}
> 	}
>
>@@ -304,12 +381,13 @@ void vsock_remove_connected(struct vsock_sock *vsk)
> }
> EXPORT_SYMBOL_GPL(vsock_remove_connected);
>
>-struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
>+struct sock *vsock_find_bound_socket_net(struct sockaddr_vm *addr,
>+					 struct net *net)
> {
> 	struct sock *sk;
>
> 	spin_lock_bh(&vsock_table_lock);
>-	sk = __vsock_find_bound_socket(addr);
>+	sk = __vsock_find_bound_socket_net(addr, net);
> 	if (sk)
> 		sock_hold(sk);
>
>@@ -317,15 +395,22 @@ struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
>
> 	return sk;
> }
>+EXPORT_SYMBOL_GPL(vsock_find_bound_socket_net);
>+
>+struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr)
>+{
>+	return vsock_find_bound_socket_net(addr, NULL);
>+}
> EXPORT_SYMBOL_GPL(vsock_find_bound_socket);
>
>-struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
>-					 struct sockaddr_vm *dst)
>+struct sock *vsock_find_connected_socket_net(struct sockaddr_vm *src,
>+					     struct sockaddr_vm *dst,
>+					     struct net *net)
> {
> 	struct sock *sk;
>
> 	spin_lock_bh(&vsock_table_lock);
>-	sk = __vsock_find_connected_socket(src, dst);
>+	sk = __vsock_find_connected_socket_net(src, dst, net);
> 	if (sk)
> 		sock_hold(sk);
>
>@@ -333,6 +418,13 @@ struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
>
> 	return sk;
> }
>+EXPORT_SYMBOL_GPL(vsock_find_connected_socket_net);
>+
>+struct sock *vsock_find_connected_socket(struct sockaddr_vm *src,
>+					 struct sockaddr_vm *dst)
>+{
>+	return vsock_find_connected_socket_net(src, dst, NULL);
>+}
> EXPORT_SYMBOL_GPL(vsock_find_connected_socket);
>
> void vsock_remove_sock(struct vsock_sock *vsk)

