netdev - Re: [PATCH] vhost: introduce mdev based hardware backend


Message-ID: <552fc91c-2eb6-8870-3077-e808e7e0917b@redhat.com>
Date:   Fri, 27 Sep 2019 20:15:41 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     "Michael S. Tsirkin" <mst@...hat.com>
Cc:     Tiwei Bie <tiwei.bie@...el.com>, alex.williamson@...hat.com,
        maxime.coquelin@...hat.com, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        netdev@...r.kernel.org, dan.daly@...el.com,
        cunming.liang@...el.com, zhihong.wang@...el.com,
        lingshan.zhu@...el.com
Subject: Re: [PATCH] vhost: introduce mdev based hardware backend


On 2019/9/27 5:38 PM, Michael S. Tsirkin wrote:
> On Fri, Sep 27, 2019 at 04:47:43PM +0800, Jason Wang wrote:
>> On 2019/9/27 12:54 PM, Tiwei Bie wrote:
>>> On Fri, Sep 27, 2019 at 11:46:06AM +0800, Jason Wang wrote:
>>>> On 2019/9/26 12:54 PM, Tiwei Bie wrote:
>>>>> +
>>>>> +static long vhost_mdev_start(struct vhost_mdev *m)
>>>>> +{
>>>>> +	struct mdev_device *mdev = m->mdev;
>>>>> +	const struct virtio_mdev_device_ops *ops = mdev_get_dev_ops(mdev);
>>>>> +	struct virtio_mdev_callback cb;
>>>>> +	struct vhost_virtqueue *vq;
>>>>> +	int idx;
>>>>> +
>>>>> +	ops->set_features(mdev, m->acked_features);
>>>>> +
>>>>> +	mdev_add_status(mdev, VIRTIO_CONFIG_S_FEATURES_OK);
>>>>> +	if (!(mdev_get_status(mdev) & VIRTIO_CONFIG_S_FEATURES_OK))
>>>>> +		goto reset;
>>>>> +
>>>>> +	for (idx = 0; idx < m->nvqs; idx++) {
>>>>> +		vq = &m->vqs[idx];
>>>>> +
>>>>> +		if (!vq->desc || !vq->avail || !vq->used)
>>>>> +			break;
>>>>> +
>>>>> +		if (ops->set_vq_state(mdev, idx, vq->last_avail_idx))
>>>>> +			goto reset;
>>>> If we do set_vq_state() in SET_VRING_BASE, we won't need this step here.
>>> Yeah, I plan to do it in the next version.
>>>
>>>>> +
>>>>> +		/*
>>>>> +		 * In vhost-mdev, userspace should pass ring addresses
>>>>> +		 * in guest physical addresses when IOMMU is disabled or
>>>>> +		 * IOVAs when IOMMU is enabled.
>>>>> +		 */
>>>> A question here, consider we're using noiommu mode. If guest physical
>>>> address is passed here, how can a device use that?
>>>>
>>>> I believe you meant "host physical address" here? And it also has the
>>>> implication that the HPA should be contiguous (e.g. using hugetlbfs).
>>> The comment is talking about the virtual IOMMU (i.e. iotlb in vhost).
>>> It should be rephrased to cover the noiommu case as well. Thanks for
>>> spotting this.
>>>
>>>
>>>>> +
>>>>> +	switch (cmd) {
>>>>> +	case VHOST_MDEV_SET_STATE:
>>>>> +		r = vhost_set_state(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_GET_FEATURES:
>>>>> +		r = vhost_get_features(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_SET_FEATURES:
>>>>> +		r = vhost_set_features(m, argp);
>>>>> +		break;
>>>>> +	case VHOST_GET_VRING_BASE:
>>>>> +		r = vhost_get_vring_base(m, argp);
>>>>> +		break;
>>>> Does it mean that SET_VRING_BASE may only take effect after
>>>> VHOST_MDEV_SET_STATE?
>>> Yeah, in this version, SET_VRING_BASE won't set the base to the
>>> device directly. But I plan to not delay this anymore in the next
>>> version to support the SET_STATUS.
>>>
>>>>> +	default:
>>>>> +		r = vhost_dev_ioctl(&m->dev, cmd, argp);
>>>>> +		if (r == -ENOIOCTLCMD)
>>>>> +			r = vhost_vring_ioctl(&m->dev, cmd, argp);
>>>>> +	}
>>>>> +
>>>>> +	mutex_unlock(&m->mutex);
>>>>> +	return r;
>>>>> +}
>>>>> +
>>>>> +static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = {
>>>>> +	.name		= "vfio-vhost-mdev",
>>>>> +	.open		= vhost_mdev_open,
>>>>> +	.release	= vhost_mdev_release,
>>>>> +	.ioctl		= vhost_mdev_unlocked_ioctl,
>>>>> +};
>>>>> +
>>>>> +static int vhost_mdev_probe(struct device *dev)
>>>>> +{
>>>>> +	struct mdev_device *mdev = mdev_from_dev(dev);
>>>>> +	const struct virtio_mdev_device_ops *ops = mdev_get_dev_ops(mdev);
>>>>> +	struct vhost_mdev *m;
>>>>> +	int nvqs, r;
>>>>> +
>>>>> +	m = kzalloc(sizeof(*m), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
>>>>> +	if (!m)
>>>>> +		return -ENOMEM;
>>>>> +
>>>>> +	mutex_init(&m->mutex);
>>>>> +
>>>>> +	nvqs = ops->get_queue_max(mdev);
>>>>> +	m->nvqs = nvqs;
>>>> The name could be confusing, get_queue_max() is to get the maximum number of
>>>> entries for a virtqueue supported by this device.
>>> OK. It might be better to rename it to something like:
>>>
>>> 	get_vq_num_max()
>>>
>>> which is more consistent with the set_vq_num().
>>>
>>>> It looks to me that we need another API to query the maximum number of
>>>> virtqueues supported by the device.
>>> Yeah.
>>>
>>> Thanks,
>>> Tiwei
>>
>> One problem here:
>>
>> If we want to support multiqueue, how does userspace know about
>> this?
> There's a feature bit for this, isn't there?


Yes, but it needs to know how many queue pairs are available.

Thanks


>
>> Note this information could be fetched from get_config() in a device
>> specific way, do we want an ioctl for accessing that area?
>>
>> Thanks
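
To make the queue-pair point above concrete, here is a minimal sketch of how
userspace might derive the number of queue pairs once it can read the device
config area. It assumes a virtio-net device whose config space follows the
virtio specification layout (mac[6], then le16 status, then le16
max_virtqueue_pairs) and that the field is only valid when VIRTIO_NET_F_MQ
(bit 22) has been offered. The helper name net_max_queue_pairs() and the
buffer-based interface are hypothetical and not part of this patch; how the
config bytes reach userspace (a new ioctl or otherwise) is exactly the open
question in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit number from the virtio specification (virtio-net). */
#define VIRTIO_NET_F_MQ 22

/* Byte offset of max_virtqueue_pairs in struct virtio_net_config:
 * mac[6] at 0..5, le16 status at 6..7, le16 max_virtqueue_pairs at 8..9. */
#define NET_CFG_MAX_VQ_PAIRS_OFF 8

/*
 * Hypothetical helper: given the offered feature bits and a raw copy of
 * the device config space, report how many queue pairs are available.
 * Returns 0 on success and -1 if the config buffer is too short.
 */
static int net_max_queue_pairs(uint64_t features, const uint8_t *cfg,
			       size_t len, uint16_t *pairs)
{
	/* Without the MQ feature bit there is exactly one RX/TX pair. */
	if (!(features & (1ULL << VIRTIO_NET_F_MQ))) {
		*pairs = 1;
		return 0;
	}

	if (len < NET_CFG_MAX_VQ_PAIRS_OFF + 2)
		return -1;

	/* The field is little-endian; decode it byte by byte. */
	*pairs = (uint16_t)(cfg[NET_CFG_MAX_VQ_PAIRS_OFF] |
			    ((uint16_t)cfg[NET_CFG_MAX_VQ_PAIRS_OFF + 1] << 8));
	return 0;
}
```

The feature bit alone only says that multiqueue is supported; the count still
has to come from the config area, which is why some access path to it is
needed.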