Message-ID: <Z-w47H3qUXZe4seQ@redhat.com>
Date: Tue, 1 Apr 2025 20:05:16 +0100
From: Daniel P. Berrangé <berrange@...hat.com>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: Bobby Eshleman <bobbyeshleman@...il.com>,
	Jakub Kicinski <kuba@...nel.org>,
	"K. Y. Srinivasan" <kys@...rosoft.com>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
	Stefan Hajnoczi <stefanha@...hat.com>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	Jason Wang <jasowang@...hat.com>,
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
	Eugenio Pérez <eperezma@...hat.com>,
	Bryan Tan <bryan-bt.tan@...adcom.com>,
	Vishnu Dasa <vishnu.dasa@...adcom.com>,
	Broadcom internal kernel review list <bcm-kernel-feedback-list@...adcom.com>,
	"David S. Miller" <davem@...emloft.net>,
	virtualization@...ts.linux.dev, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-hyperv@...r.kernel.org,
	kvm@...r.kernel.org
Subject: Re: [PATCH v2 0/3] vsock: add namespace support to vhost-vsock

On Fri, Mar 28, 2025 at 06:03:19PM +0100, Stefano Garzarella wrote:
> CCing Daniel
> 
> On Wed, Mar 12, 2025 at 01:59:34PM -0700, Bobby Eshleman wrote:
> > Picking up Stefano's v1 [1], this series adds netns support to
> > vhost-vsock. Unlike v1, this series does not address guest-to-host (g2h)
> > namespaces, deferring that for future implementation and discussion.
> > 
> > Any vsock created with /dev/vhost-vsock is a global vsock, accessible
> > from any namespace. Any vsock created with /dev/vhost-vsock-netns is a
> > "scoped" vsock, accessible only to sockets in its namespace. If a global
> > vsock and a scoped vsock share the same CID, the scoped vsock takes
> > precedence.
> > 
> > If a socket in a namespace connects to a global vsock, that CID becomes
> > unavailable to any VMM in that namespace when creating new vsocks. Once
> > the connection is closed, the CID becomes available again.
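> > 
> > For illustration, the VMM-side flow is roughly the following (a minimal
> > sketch; /dev/vhost-vsock-netns is the device added by this series,
> > VHOST_VSOCK_SET_GUEST_CID is the existing vhost ioctl, and error
> > handling is trimmed):
> > 
> >   #include <fcntl.h>
> >   #include <stdio.h>
> >   #include <sys/ioctl.h>
> >   #include <linux/types.h>
> >   #include <linux/vhost.h>
> > 
> >   /* Claim a guest CID on either the global or the namespace-scoped
> >    * vhost-vsock device. The ioctl fails with EADDRINUSE when the CID
> >    * is already taken in the relevant scope (see the third test below). */
> >   int open_vhost_vsock(int scoped, __u64 guest_cid)
> >   {
> >           const char *dev = scoped ? "/dev/vhost-vsock-netns" /* scoped CID */
> >                                    : "/dev/vhost-vsock";      /* global CID */
> >           int fd = open(dev, O_RDWR);
> > 
> >           if (fd < 0 || ioctl(fd, VHOST_VSOCK_SET_GUEST_CID, &guest_cid) < 0) {
> >                   perror(dev);
> >                   return -1;
> >           }
> >           return fd;
> >   }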
> 
> I was talking about this feature with Daniel and he pointed out something
> interesting (Daniel please feel free to correct me):
> 
>     If we have a host process that does a listen(AF_VSOCK) inside a
> namespace, can it receive connections from guests connected to
> /dev/vhost-vsock in any namespace?
> 
>     Should we provide something (e.g. a sysctl/sysfs entry) to disable
> this behaviour, preventing a process in a namespace from receiving
> connections from the global vsock address space (i.e. /dev/vhost-vsock
> VMs)?
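> 
>     For concreteness, such a listener is nothing more than the following
> (a minimal sketch, error handling trimmed); note that nothing in the
> call sequence names a namespace:
> 
>     #include <sys/socket.h>
>     #include <linux/vm_sockets.h>
> 
>     int main(void)
>     {
>         /* Host-side listener on port 1234, bound to every local CID.
>          * The open question: when this runs inside a netns, can guests
>          * attached to the global /dev/vhost-vsock still reach it? */
>         int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>         struct sockaddr_vm addr = {
>             .svm_family = AF_VSOCK,
>             .svm_cid    = VMADDR_CID_ANY,
>             .svm_port   = 1234,
>         };
> 
>         bind(fd, (struct sockaddr *)&addr, sizeof(addr));
>         listen(fd, 1);
>         return accept(fd, NULL, NULL) < 0;
>     }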

I think my concern goes a bit beyond that, to the general conceptual
idea of sharing the CID space between global vsocks and namespaced
vsocks. So I'm not sure a sysctl would be sufficient... details below.

> I understand that by default we should perhaps allow this behaviour so as
> not to break current applications, but in some cases the user may want to
> isolate sockets in a namespace from being accessed by VMs running in the
> global vsock address space as well.
> 
> Indeed, in this series we have talked mostly about the host -> guest path
> (as the direction of the connection) but little about the guest -> host
> path; maybe we should explain that better in the cover letter/commit
> descriptions/documentation.

> > Testing
> > 
> > QEMU with /dev/vhost-vsock-netns support:
> > 	https://github.com/beshleman/qemu/tree/vsock-netns
> > 
> > Test: Scoped vsocks isolated by namespace
> > 
> >  host# ip netns add ns1
> >  host# ip netns add ns2
> >  host# ip netns exec ns1 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE1} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> >  host# ip netns exec ns2 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE2} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> > 
> >  host# socat - VSOCK-CONNECT:15:1234
> >  2025/03/10 17:09:40 socat[255741] E connect(5, AF=40 cid:15 port:1234, 16): No such device
> > 
> >  host# echo foobar1 | sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> >  host# echo foobar2 | sudo ip netns exec ns2 socat - VSOCK-CONNECT:15:1234
> > 
> >  vm1# socat - VSOCK-LISTEN:1234
> >  foobar1
> >  vm2# socat - VSOCK-LISTEN:1234
> >  foobar2
> > 
> > Test: Global vsocks accessible to any namespace
> > 
> >  host# qemu-system-x86_64 \
> > 	  -m 8G -smp 4 -cpu host -enable-kvm \
> > 	  -serial mon:stdio \
> > 	  -drive if=virtio,file=${IMAGE2} \
> > 	  -device vhost-vsock-pci,guest-cid=15,netns=off
> > 
> >  host# echo foobar | sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> > 
> >  vm# socat - VSOCK-LISTEN:1234
> >  foobar
> > 
> > Test: Connecting to a global vsock makes its CID unavailable to the namespace
> > 
> >  host# qemu-system-x86_64 \
> > 	  -m 8G -smp 4 -cpu host -enable-kvm \
> > 	  -serial mon:stdio \
> > 	  -drive if=virtio,file=${IMAGE2} \
> > 	  -device vhost-vsock-pci,guest-cid=15,netns=off
> > 
> >  vm# socat - VSOCK-LISTEN:1234
> > 
> >  host# sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234
> >  host# ip netns exec ns1 \
> > 				  qemu-system-x86_64 \
> > 					  -m 8G -smp 4 -cpu host -enable-kvm \
> > 					  -serial mon:stdio \
> > 					  -drive if=virtio,file=${IMAGE1} \
> > 					  -device vhost-vsock-pci,netns=on,guest-cid=15
> > 
> >  qemu-system-x86_64: -device vhost-vsock-pci,netns=on,guest-cid=15: vhost-vsock: unable to set guest cid: Address already in use

I find it conceptually quite unsettling that the AF_VSOCK CID address
space is shared between the host and the namespace. That feels contrary
to how namespaces are commonly used: to deterministically isolate
resources between the namespace and the host.

Naively I would expect that in a namespace, all VSOCK CIDs are
free for use, without having to worry about which CIDs are in use
in the host now or in the future.
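
To make this concrete: a connect() from inside the namespace names only
a CID and a port, so nothing in the API distinguishes a scoped owner of
CID 15 from a global one (a minimal sketch in C, error handling trimmed;
CID and port taken from the transcripts above):

   #include <sys/socket.h>
   #include <linux/vm_sockets.h>

   /* The very same call whether CID 15 belongs to a VM scoped to our
    * namespace or to a VM in the global CID space -- the kernel alone
    * decides which one we reach. */
   int vsock_connect(void)
   {
       int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
       struct sockaddr_vm addr = {
           .svm_family = AF_VSOCK,
           .svm_cid    = 15,
           .svm_port   = 1234,
       };

       if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
           return -1;
       return fd;
   }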

What happens if we reverse the QEMU order above, to get the
following scenario:

   # Launch VM1 inside the NS
   host# ip netns exec ns1 \
  				  qemu-system-x86_64 \
  					  -m 8G -smp 4 -cpu host -enable-kvm \
  					  -serial mon:stdio \
  					  -drive if=virtio,file=${IMAGE1} \
  					  -device vhost-vsock-pci,netns=on,guest-cid=15
   # Launch VM2
   host# qemu-system-x86_64 \
  	  -m 8G -smp 4 -cpu host -enable-kvm \
  	  -serial mon:stdio \
  	  -drive if=virtio,file=${IMAGE2} \
  	  -device vhost-vsock-pci,guest-cid=15,netns=off
  
   vm1# socat - VSOCK-LISTEN:1234
   vm2# socat - VSOCK-LISTEN:1234

   host# socat - VSOCK-CONNECT:15:1234
     => Presume this connects to "VM2" running outside the NS

   host# sudo ip netns exec ns1 socat - VSOCK-CONNECT:15:1234

     => Does this connect to "VM1" inside the NS, or "VM2"
        outside the NS ?



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

