lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 16 Mar 2018 19:36:47 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Tiwei Bie <tiwei.bie@...el.com>
Cc:     mst@...hat.com, virtualization@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        wexu@...hat.com, jfreimann@...hat.com
Subject: Re: [PATCH RFC 2/2] virtio_ring: support packed ring



On 2018年03月16日 18:04, Tiwei Bie wrote:
> On Fri, Mar 16, 2018 at 04:34:28PM +0800, Jason Wang wrote:
>> On 2018年03月16日 15:40, Tiwei Bie wrote:
>>> On Fri, Mar 16, 2018 at 02:44:12PM +0800, Jason Wang wrote:
>>>> On 2018年03月16日 14:10, Tiwei Bie wrote:
>>>>> On Fri, Mar 16, 2018 at 12:03:25PM +0800, Jason Wang wrote:
>>>>>> On 2018年02月23日 19:18, Tiwei Bie wrote:
>>>>>>> Signed-off-by: Tiwei Bie <tiwei.bie@...el.com>
>>>>>>> ---
>>>>>>>      drivers/virtio/virtio_ring.c | 699 +++++++++++++++++++++++++++++++++++++------
>>>>>>>      include/linux/virtio_ring.h  |   8 +-
>>>>>>>      2 files changed, 618 insertions(+), 89 deletions(-)
>>>

[...]

>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +	for (; n < (out_sgs + in_sgs); n++) {
>>>>>>> +		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
>>>>>>> +			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
>>>>>>> +			if (vring_mapping_error(vq, addr))
>>>>>>> +				goto unmap_release;
>>>>>>> +
>>>>>>> +			flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT |
>>>>>>> +					VRING_DESC_F_WRITE |
>>>>>>> +					VRING_DESC_F_AVAIL(vq->wrap_counter) |
>>>>>>> +					VRING_DESC_F_USED(!vq->wrap_counter));
>>>>>>> +			if (!indirect && i == head)
>>>>>>> +				head_flags = flags;
>>>>>>> +			else
>>>>>>> +				desc[i].flags = flags;
>>>>>>> +
>>>>>>> +			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
>>>>>>> +			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
>>>>>>> +			desc[i].id = cpu_to_virtio32(_vq->vdev, head);
>>>>>>> +			prev = i;
>>>>>>> +			i++;
>>>>>>> +			if (!indirect && i >= vq->vring_packed.num) {
>>>>>>> +				i = 0;
>>>>>>> +				vq->wrap_counter ^= 1;
>>>>>>> +			}
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +	/* Last one doesn't continue. */
>>>>>>> +	if (!indirect && (head + 1) % vq->vring_packed.num == i)
>>>>>>> +		head_flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
>>>>>> I can't get the why we need this here.
>>>>> If only one desc is used, we will need to clear the
>>>>> VRING_DESC_F_NEXT flag from the head_flags.
>>>> Yes, I meant why following desc[prev].flags won't work for this?
>>> Because the update of desc[head].flags (in above case,
>>> prev == head) has been delayed. The flags is saved in
>>> head_flags.
>> Ok, but let's try to avoid modular here e.g tracking the number of sgs in a
>> counter.
>>
>> And I see lots of duplication in the above two loops, I believe we can unify
>> them with a a single loop. the only difference is dma direction and write
>> flag.
> The above implementation for packed ring is basically
> an mirror of the existing implementation in split ring
> as I want to keep the coding style consistent. Below
> is the corresponding code in split ring:
>
> static inline int virtqueue_add(struct virtqueue *_vq,
> 				struct scatterlist *sgs[],
> 				unsigned int total_sg,
> 				unsigned int out_sgs,
> 				unsigned int in_sgs,
> 				void *data,
> 				void *ctx,
> 				gfp_t gfp)
> {
> 	......
>
> 	for (n = 0; n < out_sgs; n++) {
> 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> 			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_TO_DEVICE);
> 			if (vring_mapping_error(vq, addr))
> 				goto unmap_release;
>
> 			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT);
> 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> 			prev = i;
> 			i = virtio16_to_cpu(_vq->vdev, desc[i].next);
> 		}
> 	}
> 	for (; n < (out_sgs + in_sgs); n++) {
> 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> 			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
> 			if (vring_mapping_error(vq, addr))
> 				goto unmap_release;
>
> 			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
> 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> 			prev = i;
> 			i = virtio16_to_cpu(_vq->vdev, desc[i].next);
> 		}
> 	}
>
> 	......
> }

There's no need for such consistency especially consider it's a new kind 
of ring. Anyway, you can stick to it.

[...]

>>>>>>> +	} else
>>>>>>> +		vq->free_head = i;
>>>>>> ID is only valid in the last descriptor in the list, so head + 1 should be
>>>>>> ok too?
>>>>> I don't really get your point. The vq->free_head stores
>>>>> the index of the next avail desc.
>>>> I think I get your idea now, free_head has two meanings:
>>>>
>>>> - next avail index
>>>> - buffer id
>>> In my design, free_head is just the index of the next
>>> avail desc.
>>>
>>> Driver can set anything to buffer ID.
>> Then you need another method to track id to context e.g hashing.
> I keep the context in desc_state[desc_idx]. So there is
> no extra method needed to track the context.

Well, it works for this patch but my reply was for "set anything to 
buffer ID". The size of desc_state is limited, so in fact you can't use 
a value greater than vq.num.


>

[...]

>> @@ -1096,17 +1599,21 @@ struct virtqueue *vring_create_virtqueue(
>>>>>>>      	if (!queue) {
>>>>>>>      		/* Try to get a single page. You are my only hope! */
>>>>>>> -		queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
>>>>>>> +		queue = vring_alloc_queue(vdev, __vring_size(num, vring_align,
>>>>>>> +							     packed),
>>>>>>>      					  &dma_addr, GFP_KERNEL|__GFP_ZERO);
>>>>>>>      	}
>>>>>>>      	if (!queue)
>>>>>>>      		return NULL;
>>>>>>> -	queue_size_in_bytes = vring_size(num, vring_align);
>>>>>>> -	vring_init(&vring, num, queue, vring_align);
>>>>>>> +	queue_size_in_bytes = __vring_size(num, vring_align, packed);
>>>>>>> +	if (packed)
>>>>>>> +		vring_packed_init(&vring.vring_packed, num, queue, vring_align);
>>>>>>> +	else
>>>>>>> +		vring_init(&vring.vring_split, num, queue, vring_align);
>>>>>> Let's rename vring_init to vring_init_split() like other helpers?
>>>>> The vring_init() is a public API in include/uapi/linux/virtio_ring.h.
>>>>> I don't think we can rename it.
>>>> I see, then this need more thoughts to unify the API.
>>> My thought is to keep the old API as is, and introduce
>>> new types and helpers for packed ring.
>> I admit it's not a fault of this patch. But we'd better think of this in the
>> future, consider we may have new kinds of ring.
>>
>>> More details can be found in this patch:
>>> https://lkml.org/lkml/2018/2/23/243
>>> (PS. The type which has bit fields is just for reference,
>>>    and will be changed in next version.)
>>>
>>> Do you have any other suggestions?
>> No.
> Hmm.. Sorry, I didn't describe my question well.
> I mean do you have any suggestions about the API
> design for packed ring in uapi header? Currently
> I introduced below two new helpers:
>
> static inline void vring_packed_init(struct vring_packed *vr, unsigned int num,
> 				     void *p, unsigned long align);
> static inline unsigned vring_packed_size(unsigned int num, unsigned long align);
>
> When new rings are introduced in the future, above
> helpers can't be reused. Maybe we should make the
> helpers be able to determine the ring type?

Let's wait for Michael's comment here. Generally, I fail to understand 
why vring_init() become a part of uapi. Git grep shows the only use 
cases are virtio_test/vringh_test.

Thanks

>
> Best regards,
> Tiwei Bie
>
>> Thanks
>>
>>> Best regards,
>>> Tiwei Bie
>>>
>>>>>>> -	vq = __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
>>>>>>> -				   notify, callback, name);
>>>>>>> +	vq = __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
>>>>>>> +				   context, notify, callback, name);
>>>>>>>      	if (!vq) {
>>>>>>>      		vring_free_queue(vdev, queue_size_in_bytes, queue,
>>>>>>>      				 dma_addr);
>>> [...]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ