Date:   Mon, 27 May 2019 12:44:47 +0200
From:   Stefano Garzarella <>
To:     Stefan Hajnoczi <>,
        Jorgen Hansen <>
Cc:, Dexuan Cui <>,
        "David S. Miller" <>,
        Vishnu Dasa <>,
        "K. Y. Srinivasan" <>,
        Haiyang Zhang <>,
        Stephen Hemminger <>,
        Sasha Levin <>
Subject: Re: [RFC] vsock: proposal to support multiple transports at runtime

On Thu, May 23, 2019 at 04:37:03PM +0100, Stefan Hajnoczi wrote:
> On Tue, May 14, 2019 at 10:15:43AM +0200, Stefano Garzarella wrote:
> > Hi guys,
> > I'm currently interested in implementing multi-transport support for VSOCK in
> > order to handle nested VMs.
> > 
> > As Stefan suggested, I started to look at this discussion:
> >
> > Below I tried to summarize a proposal for a discussion, following the ideas
> > from Dexuan, Jorgen, and Stefan.
> > 
> > 
> > We can define two types of transport that we have to handle at the same time
> > (e.g. in a nested VM we would have both types of transport running together):
> > 
> > - 'host side transport', it runs in the host and it is used to communicate with
> >   the guests of a specific hypervisor (KVM, VMWare or HyperV)
> > 
> >   Should we support multiple 'host side transport' running at the same time?
> > 
> > - 'guest side transport', it runs in the guest and it is used to communicate
> >   with the host transport
> I find this terminology confusing.  Perhaps "host->guest" (your 'host
> side transport') and "guest->host" (your 'guest side transport') is
> clearer?

I agree, "host->guest" and "guest->host" are better, I'll use them.

> Or maybe the nested virtualization terminology of L2 transport (your
> 'host side transport') and L0 transport (your 'guest side transport')?
> Here we are the L1 guest and L0 is the host and L2 is our nested guest.

I'm confused: if L2 is the nested guest, it should be the
'guest side transport'. Did I miss anything?

Maybe that is another point in favour of your first proposal :)

> > 
> > 
> > The main goal is to find a way to decide which transport to use in these cases:
> > 1. connect() / sendto()
> > 
> > 	a. use the 'host side transport', if the destination is the guest
> > 	   (dest_cid > VMADDR_CID_HOST).
> > 	   If we want to support multiple 'host side transport' running at the
> > 	   same time, we should assign CIDs uniquely across all transports.
> > 	   In this way, a packet generated by the host side will get directed
> > 	   to the appropriate transport based on the CID
> The multiple host side transport case is unlikely to be necessary on x86
> where only one hypervisor uses VMX at any given time.  But eventually it
> may happen so it's wise to at least allow it in the design.

Okay, I was in doubt, but I'll keep it in the design.
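
As a rough illustration of "CIDs assigned uniquely across all transports",
here is a minimal sketch in C. Everything in it is hypothetical (the
sketch_* names, the table, the example CIDs); it only shows that a CID can
select exactly one host->guest transport when no CID is shared.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for host->guest transports. */
enum sketch_hv { SKETCH_HV_NONE, SKETCH_HV_KVM, SKETCH_HV_VMWARE };

/* Hypothetical registry: each guest CID is owned by exactly one transport. */
struct sketch_cid_owner {
	uint32_t cid;
	enum sketch_hv hv;
};

static const struct sketch_cid_owner sketch_owners[] = {
	{ 3, SKETCH_HV_KVM },
	{ 4, SKETCH_HV_KVM },
	{ 5, SKETCH_HV_VMWARE },
};

/* Return the transport that owns @cid, or SKETCH_HV_NONE if nobody does. */
enum sketch_hv sketch_transport_for_cid(uint32_t cid)
{
	size_t n = sizeof(sketch_owners) / sizeof(sketch_owners[0]);

	for (size_t i = 0; i < n; i++) {
		if (sketch_owners[i].cid == cid)
			return sketch_owners[i].hv;
	}
	return SKETCH_HV_NONE;
}
```

A real implementation would need the transports to register and release
CIDs dynamically; the static table is only there to keep the example short.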

> > 
> > 	b. use the 'guest side transport', if the destination is the host
> > 	   (dest_cid == VMADDR_CID_HOST)
> Makes sense to me.
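
To make the connect() / sendto() rules above concrete, here is a minimal
sketch in C. The sketch_* names are hypothetical. The CID values mirror
<linux/vm_sockets.h> but are redefined so the snippet builds anywhere.
Returning "none" for VMADDR_CID_ANY and for CIDs below VMADDR_CID_HOST is
my assumption, since the proposal does not cover those destinations.

```c
#include <stdint.h>

/* Values mirror <linux/vm_sockets.h>; redefined to stay self-contained. */
#define SKETCH_CID_ANY   UINT32_MAX	/* VMADDR_CID_ANY (-1U) */
#define SKETCH_CID_HOST  2u		/* VMADDR_CID_HOST */

enum sketch_transport {
	SKETCH_NONE,		/* not covered by the proposal (assumption) */
	SKETCH_HOST_TO_GUEST,	/* 'host side transport' */
	SKETCH_GUEST_TO_HOST,	/* 'guest side transport' */
};

/* Pick a transport for connect()/sendto() from the destination CID. */
enum sketch_transport sketch_connect_transport(uint32_t dest_cid)
{
	if (dest_cid == SKETCH_CID_ANY)
		return SKETCH_NONE;		/* assumption: not a valid destination */
	if (dest_cid == SKETCH_CID_HOST)
		return SKETCH_GUEST_TO_HOST;	/* rule 1.b */
	if (dest_cid > SKETCH_CID_HOST)
		return SKETCH_HOST_TO_GUEST;	/* rule 1.a */
	return SKETCH_NONE;			/* CIDs 0 and 1: left open here */
}
```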
> > 
> > 
> > 2. listen() / recvfrom()
> > 
> > 	a. use the 'host side transport', if the socket is bound to
> > 	   VMADDR_CID_HOST, or it is bound to VMADDR_CID_ANY and there is no
> > 	   guest transport.
> > 	   We could also define a new VMADDR_CID_LISTEN_FROM_GUEST in order to
> > 	   address this case.
> > 	   If we want to support multiple 'host side transport' running at the
> > 	   same time, we should find a way to allow an application to bind to a
> > 	   specific host transport (e.g. adding new VMADDR_CID_LISTEN_FROM_KVM,
> VMADDR_CID_LISTEN_FROM_HYPERV isn't very flexible.  What if my service
> should only be available to a subset of VMware VMs?

You're right, it is not very flexible.

> Instead it might be more appropriate to use network namespaces to create
> independent AF_VSOCK addressing domains.  Then you could have two
> separate groups of VMware VMs and selectively listen to just one group.

Does AF_VSOCK support network namespaces, or would that be another
improvement to take care of? (IIUC it is not currently supported.)

A possible issue I see with netns: if they are also used for other
purposes (e.g. to isolate the network of a VM), we would need
multiple instances of the application, one per netns.

> > 
> > 	b. use the 'guest side transport', if the socket is bound to local CID
> > 	   different from the VMADDR_CID_HOST (guest CID get with
> > 	   (to be backward compatible).
> > 	   Also in this case, we could define a new VMADDR_CID_LISTEN_FROM_HOST.
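
The listen() / recvfrom() rules in 2.a and 2.b could be sketched in C as
follows. The sketch_* names are hypothetical and the CID values mirror
<linux/vm_sockets.h>. Note one assumption of mine: a socket bound to
VMADDR_CID_ANY while a guest transport is loaded is sent to the guest side,
which is how I read the backward-compatibility remark. The proposed
VMADDR_CID_LISTEN_FROM_* constants are left out on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror <linux/vm_sockets.h>; redefined to stay self-contained. */
#define SKETCH_CID_ANY   UINT32_MAX	/* VMADDR_CID_ANY (-1U) */
#define SKETCH_CID_HOST  2u		/* VMADDR_CID_HOST */

enum sketch_transport {
	SKETCH_HOST_TO_GUEST,	/* 'host side transport' */
	SKETCH_GUEST_TO_HOST,	/* 'guest side transport' */
};

/* Pick a transport for a listening socket from the CID it is bound to. */
enum sketch_transport sketch_listen_transport(uint32_t bound_cid,
					      bool guest_transport_loaded)
{
	/* Rule 2.a: bound to the host CID. */
	if (bound_cid == SKETCH_CID_HOST)
		return SKETCH_HOST_TO_GUEST;

	/* Rule 2.a: bound to ANY and no guest transport is available. */
	if (bound_cid == SKETCH_CID_ANY && !guest_transport_loaded)
		return SKETCH_HOST_TO_GUEST;

	/*
	 * Rule 2.b: bound to a local CID other than the host CID.
	 * Assumption: ANY with a guest transport loaded also lands here.
	 */
	return SKETCH_GUEST_TO_HOST;
}
```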
> Two additional topics:
> 1. How will loading af_vsock.ko change?

I'd allow the loading of af_vsock.ko without any transport.
Maybe we should move the MODULE_ALIAS_NETPROTO(PF_VSOCK) from the
vmci_transport.ko to the af_vsock.ko, but this can impact the VMware

>    In particular, can an
>    application create a socket in af_vsock.ko without any loaded
>    transport?  Can it enter listen state without any loaded transport
>    (this seems useful with VMADDR_CID_ANY)?

I'll check if we can allow listen sockets without any loaded transport,
but I think it would be a nice behaviour to have.

> 2. Does your proposed behavior match VMware's existing nested vsock
>    semantics?

I'm not sure, but I tried to follow Jorgen's answers in the original
thread. I hope that this proposal matches the VMware semantics.

@Jorgen, do you have any advice?

