netdev - Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20151215074700.GC30291@stefanha-x1.localdomain>
Date:	Tue, 15 Dec 2015 15:47:00 +0800
From:	Stefan Hajnoczi <stefanha@...hat.com>
To:	Alex Bennée <alex.bennee@...aro.org>
Cc:	kvm@...r.kernel.org, Matt Benjamin <mbenjamin@...hat.com>,
	Christoffer Dall <christoffer.dall@...aro.org>,
	netdev@...r.kernel.org, "Michael S. Tsirkin" <mst@...hat.com>,
	matt.ma@...aro.org, virtualization@...ts.linux-foundation.org,
	Asias He <asias@...hat.com>
Subject: Re: [PATCH v3 3/4] VSOCK: Introduce vhost-vsock.ko

On Fri, Dec 11, 2015 at 01:45:29PM +0000, Alex Bennée wrote:
> > +		if (head == vq->num) {
> > +			if (unlikely(vhost_enable_notify(&vsock->dev, vq))) {
> > +				vhost_disable_notify(&vsock->dev, vq);
> > +				continue;
> 
> Why are we doing this? If we enable something we then disable it? A
> comment as to what is going on here would be useful.

This is a standard optimization to avoid vmexits that other vhost
devices and QEMU implement too.

When the host begins pulling buffers off a virtqueue it first disables
guest->host notifications.  If the guest adds additional buffers while
the host is processing, the notification (vmexit) is skipped.  The host
re-enables guest->host notifications when it finishes virtqueue
processing.

If the guest added buffers after vhost_get_vq_desc() but before
vhost_enable_notify(), then vhost_enable_notify() returns true and the
host must process the buffers (i.e. restart the loop).  Failure to do so
could result in deadlocks because the guest didn't notify and the host
would be waiting for a notification.

I will add comments to the code.

> > +		vhost_add_used(vq, head, pkt->len); /* TODO should this
> > be sizeof(pkt->hdr) + pkt->len? */
> 
> TODO needs sorting our or removing.

Will fix in the next revision.

> > +	/* Respect global tx buf limitation */
> > +	mutex_lock(&vq->mutex);
> > +	while (pkt_len + vsock->total_tx_buf >
> > VIRTIO_VSOCK_MAX_TX_BUF_SIZE) {
> 
> I'm curious about the relationship between
> VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE above and VIRTIO_VSOCK_MAX_TX_BUF_SIZE
> just here. Why do we need to limit pkt_len to the smaller when really
> all that matters is pkt_len + vsock->total_tx_buf >
> VIRTIO_VSOCK_MAX_TX_BUF_SIZE?

There are two separate issues:

1. The total amount of pending data.  The idea is to stop queuing
   packets and make the caller wait until resources become available so
   that vhost_vsock.ko memory consumption is bounded.

   total_tx_buf len is an artificial limit that is lower than the actual
   virtqueue maximum data size.  Otherwise we could just rely on the
   virtqueue to limit the size but it can be very large.

2. Splitting data into packets that fit into rx virtqueue buffers.  The
   guest sets up the rx virtqueue with VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE
   buffers.  Here, vhost_vsock.ko is assuming that the rx virtqueue
   buffers are always VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE bytes so it
   splits data along this boundary.

   This is ugly because the guest could choose a different buffer size
   and the host has VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE hardcoded.  I'll
   look into eliminating this assumption.

> > +static void vhost_vsock_handle_ctl_kick(struct vhost_work *work)
> > +{
> > +	struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
> > +						  poll.work);
> > +	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
> > +						 dev);
> > +
> > +	pr_debug("%s vq=%p, vsock=%p\n", __func__, vq, vsock);
> > +}
> 
> This doesn't handle anything, it just prints debug stuff. Should this be
> a NOP function?

The control virtqueue is currently not used.  In the next revision this
function will be dropped.

> > +static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> > +{
> > +	struct vhost_virtqueue *vq;
> > +	int i;
> > +
> > +	if (features & ~VHOST_VSOCK_FEATURES)
> > +		return -EOPNOTSUPP;
> > +
> > +	mutex_lock(&vsock->dev.mutex);
> > +	if ((features & (1 << VHOST_F_LOG_ALL)) &&
> > +	    !vhost_log_access_ok(&vsock->dev)) {
> > +		mutex_unlock(&vsock->dev.mutex);
> > +		return -EFAULT;
> > +	}
> > +
> > +	for (i = 0; i < VSOCK_VQ_MAX; i++) {
> > +		vq = &vsock->vqs[i].vq;
> > +		mutex_lock(&vq->mutex);
> > +		vq->acked_features = features;
> 
> Is this a user supplied flag? Should it be masked to valid values?

That is already done above where VHOST_VSOCK_FEATURES is checked.

Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)