[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e08c55df-6ea2-cf7f-d7c9-91e55913ab16@redhat.com>
Date: Tue, 24 Apr 2018 08:54:52 +0800
From: Jason Wang <jasowang@...hat.com>
To: Tiwei Bie <tiwei.bie@...el.com>
Cc: mst@...hat.com, wexu@...hat.com,
virtualization@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
jfreimann@...hat.com
Subject: Re: [RFC v2] virtio: support packed ring
On 2018年04月23日 17:29, Tiwei Bie wrote:
> On Mon, Apr 23, 2018 at 01:42:14PM +0800, Jason Wang wrote:
>> On 2018年04月01日 22:12, Tiwei Bie wrote:
>>> Hello everyone,
>>>
>>> This RFC implements packed ring support for virtio driver.
>>>
>>> The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
>>> by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
>>> Minor changes are needed for the vhost code, e.g. to kick the guest.
>>>
>>> TODO:
>>> - Refinements and bug fixes;
>>> - Split into small patches;
>>> - Test indirect descriptor support;
>>> - Test/fix event suppression support;
>>> - Test devices other than net;
>>>
>>> RFC v1 -> RFC v2:
>>> - Add indirect descriptor support - compile test only;
>>> - Add event suppression supprt - compile test only;
>>> - Move vring_packed_init() out of uapi (Jason, MST);
>>> - Merge two loops into one in virtqueue_add_packed() (Jason);
>>> - Split vring_unmap_one() for packed ring and split ring (Jason);
>>> - Avoid using '%' operator (Jason);
>>> - Rename free_head -> next_avail_idx (Jason);
>>> - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
>>> - Some other refinements and bug fixes;
>>>
>>> Thanks!
>>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@...el.com>
>>> ---
>>> drivers/virtio/virtio_ring.c | 1094 +++++++++++++++++++++++++++++-------
>>> include/linux/virtio_ring.h | 8 +-
>>> include/uapi/linux/virtio_config.h | 12 +-
>>> include/uapi/linux/virtio_ring.h | 61 ++
>>> 4 files changed, 980 insertions(+), 195 deletions(-)
>>>
>>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>>> index 71458f493cf8..0515dca34d77 100644
>>> --- a/drivers/virtio/virtio_ring.c
>>> +++ b/drivers/virtio/virtio_ring.c
>>> @@ -58,14 +58,15 @@
>> [...]
>>
>>> +
>>> + if (vq->indirect) {
>>> + u32 len;
>>> +
>>> + desc = vq->desc_state[head].indir_desc;
>>> + /* Free the indirect table, if any, now that it's unmapped. */
>>> + if (!desc)
>>> + goto out;
>>> +
>>> + len = virtio32_to_cpu(vq->vq.vdev,
>>> + vq->vring_packed.desc[head].len);
>>> +
>>> + BUG_ON(!(vq->vring_packed.desc[head].flags &
>>> + cpu_to_virtio16(vq->vq.vdev, VRING_DESC_F_INDIRECT)));
>> It looks to me spec does not force to keep VRING_DESC_F_INDIRECT here. So we
>> can safely remove this BUG_ON() here.
>>
>>> + BUG_ON(len == 0 || len % sizeof(struct vring_packed_desc));
>> Len could be ignored for used descriptor according to the spec, so we need
>> remove this BUG_ON() too.
> Yeah, you're right! The BUG_ON() isn't right. I'll remove it.
> And I think something related to this in the spec isn't very
> clear currently.
>
> In the spec, there are below words:
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L272
> """
> In descriptors with VIRTQ_DESC_F_INDIRECT set VIRTQ_DESC_F_WRITE
> is reserved and is ignored by the device.
> """
>
> So when device writes back an used descriptor in this case,
> device may not set the VIRTQ_DESC_F_WRITE flag as the flag
> is reserved and should be ignored.
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L170
> """
> Element Length is reserved for used descriptors without the
> VIRTQ_DESC_F_WRITE flag, and is ignored by drivers.
> """
>
> And this is the way how driver ignores the `len` in an used
> descriptor.
>
> https://github.com/oasis-tcs/virtio-spec/blob/d4fec517dfcf/packed-ring.tex#L241
> """
> To increase ring capacity the driver can store a (read-only
> by the device) table of indirect descriptors anywhere in memory,
> and insert a descriptor in the main virtqueue (with \field{Flags}
> bit VIRTQ_DESC_F_INDIRECT on) that refers to a buffer element
> containing this indirect descriptor table;
> """
>
> So the indirect descriptors in the table are read-only by
> the device. And the only descriptor which is writeable by
> the device is the descriptor in the main virtqueue (with
> Flags bit VIRTQ_DESC_F_INDIRECT on). So if we ignore the
> `len` in this descriptor, we won't be able to get the
> length of the data written by the device.
>
> So I think the `len` in this descriptor will carry the
> length of the data written by the device (if the buffers
> are writable to the device) even if the VIRTQ_DESC_F_WRITE
> isn't set by the device. How do you think?
Yes I think so. But we'd better need clarification from Michael.
>
>
>> The reason is we don't touch descriptor ring in the case of split, so
>> BUG_ON()s may help there.
>>
>>> +
>>> + for (j = 0; j < len / sizeof(struct vring_packed_desc); j++)
>>> + vring_unmap_one_packed(vq, &desc[j]);
>>> +
>>> + kfree(desc);
>>> + vq->desc_state[head].indir_desc = NULL;
>>> + } else if (ctx) {
>>> + *ctx = vq->desc_state[head].indir_desc;
>>> + }
>>> +
>>> +out:
>>> + return vq->desc_state[head].num;
>>> +}
>>> +
>>> +static inline bool more_used_split(const struct vring_virtqueue *vq)
>>> {
>>> return vq->last_used_idx != virtio16_to_cpu(vq->vq.vdev, vq->vring.used->idx);
>>> }
>>> +static inline bool more_used_packed(const struct vring_virtqueue *vq)
>>> +{
>>> + u16 last_used, flags;
>>> + bool avail, used;
>>> +
>>> + if (vq->vq.num_free == vq->vring_packed.num)
>>> + return false;
>>> +
>>> + last_used = vq->last_used_idx;
>>> + flags = virtio16_to_cpu(vq->vq.vdev,
>>> + vq->vring_packed.desc[last_used].flags);
>>> + avail = flags & VRING_DESC_F_AVAIL(1);
>>> + used = flags & VRING_DESC_F_USED(1);
>>> +
>>> + return avail == used;
>>> +}
>> This looks interesting, spec said:
>>
>> "
>> Thus VIRTQ_DESC_F_AVAIL and VIRTQ_DESC_F_USED bits are different for an
>> available descriptor and
>> equal for a used descriptor.
>> Note that this observation is mostly useful for sanity-checking as these are
>> necessary but not sufficient
>> conditions - for example, all descriptors are zero-initialized. To detect
>> used and available descriptors it is
>> possible for drivers and devices to keep track of the last observed value of
>> VIRTQ_DESC_F_USED/VIRTQ_-
>> DESC_F_AVAIL. Other techniques to detect
>> VIRTQ_DESC_F_AVAIL/VIRTQ_DESC_F_USED bit changes
>> might also be possible.
>> "
>>
>> So it looks to me it was not sufficient, looking at the example codes in
>> spec, do we need to track last seen used_wrap_counter here?
> I don't think we have to track used_wrap_counter in
> driver. There was a discussion on this:
>
> https://lists.oasis-open.org/archives/virtio-dev/201802/msg00177.html
>
> And after that, below sentence was added (it's also
> in the above words you quoted):
>
> """
> Other techniques to detect
> VIRTQ_DESC_F_AVAIL/VIRTQ_DESC_F_USED bit changes
> might also be possible.
> """
>
> Best regards,
> Tiwei Bie
I see, the extra condition "if (vq->vq.num_free ==
vq->vring_packed.num)" help in this case.
Thanks
>
>> Thanks
Powered by blists - more mailing lists