[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c3b34d3c-29ea-5a0f-24d3-483836faa7ba@salutedevices.com>
Date: Tue, 5 Sep 2023 07:30:38 +0300
From: Arseniy Krasnov <avkrasnov@...utedevices.com>
To: Stefano Garzarella <sgarzare@...hat.com>
CC: Stefan Hajnoczi <stefanha@...hat.com>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
<kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, "Michael S. Tsirkin"
<mst@...hat.com>, Jason Wang <jasowang@...hat.com>, Bobby Eshleman
<bobby.eshleman@...edance.com>, <kvm@...r.kernel.org>,
<virtualization@...ts.linux-foundation.org>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <kernel@...rdevices.ru>, <oxffffaa@...il.com>
Subject: Re: [PATCH net-next v7 4/4] vsock/virtio: MSG_ZEROCOPY flag support
On 04.09.2023 11:21, Stefano Garzarella wrote:
> On Sun, Sep 03, 2023 at 11:13:23AM +0300, Arseniy Krasnov wrote:
>>
>>
>> On 01.09.2023 15:30, Stefano Garzarella wrote:
>>> On Sun, Aug 27, 2023 at 11:54:36AM +0300, Arseniy Krasnov wrote:
>>>> This adds handling of MSG_ZEROCOPY flag on transmission path: if this
>>>> flag is set and zerocopy transmission is possible (enabled in socket
>>>> options and transport allows zerocopy), then non-linear skb will be
>>>> created and filled with the pages of user's buffer. Pages of user's
>>>> buffer are locked in memory by 'get_user_pages()'. Second thing that
>>>> this patch does is replace type of skb owning: instead of calling
>>>> 'skb_set_owner_sk_safe()' it calls 'skb_set_owner_w()'. Reason of this
>>>> change is that '__zerocopy_sg_from_iter()' increments 'sk_wmem_alloc'
>>>> of socket, so to decrease this field correctly proper skb destructor is
>>>> needed: 'sock_wfree()'. This destructor is set by 'skb_set_owner_w()'.
>>>>
>>>> Signed-off-by: Arseniy Krasnov <avkrasnov@...utedevices.com>
>
> [...]
>
>>>>
>>>> -/* Returns a new packet on success, otherwise returns NULL.
>>>> - *
>>>> - * If NULL is returned, errp is set to a negative errno.
>>>> - */
>>>> -static struct sk_buff *
>>>> -virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info,
>>>> - size_t len,
>>>> - u32 src_cid,
>>>> - u32 src_port,
>>>> - u32 dst_cid,
>>>> - u32 dst_port)
>>>> -{
>>>> - const size_t skb_len = VIRTIO_VSOCK_SKB_HEADROOM + len;
>>>> - struct virtio_vsock_hdr *hdr;
>>>> - struct sk_buff *skb;
>>>> +static bool virtio_transport_can_zcopy(struct virtio_vsock_pkt_info *info,
>>>> + size_t max_to_send)
>>> ^
>>> I'd call it `pkt_len`, `max_to_send` is confusing IMHO. I didn't
>>> initially if it was the number of buffers or bytes.
>>>
>>>> +{
>>>> + const struct virtio_transport *t_ops;
>>>> + struct iov_iter *iov_iter;
>>>> +
>>>> + if (!info->msg)
>>>> + return false;
>>>> +
>>>> + iov_iter = &info->msg->msg_iter;
>>>> +
>>>> + if (iov_iter->iov_offset)
>>>> + return false;
>>>> +
>>>> + /* We can't send whole iov. */
>>>> + if (iov_iter->count > max_to_send)
>>>> + return false;
>>>> +
>>>> + /* Check that transport can send data in zerocopy mode. */
>>>> + t_ops = virtio_transport_get_ops(info->vsk);
>>>> +
>>>> + if (t_ops->can_msgzerocopy) {
>>>
>>> So if `can_msgzerocopy` is not implemented, we always return true after
>>> this point. Should we mention it in the .can_msgzerocopy documentation?
^^^
Sorry, ops again. Just checked this code during comments fixing. It is correct
to "return true;" Idea is:
if (generic conditions for MSG_ZEROCOPY == false)
return false;// can't zerocopy
if (t_ops->can_msgzerocopy) //transport needs extra check
return t_ops->can_msgzerocopy();
return true;//transport doesn't require extra check and generic conditions above are OK -> can zerocopy
But anyway:
1) I'll add comment in 'struct virtio_transport' for '.can_msgzerocopy' that this callback is
not mandatory - just additional transport specific check.
2) Add test for fallback to copy.
Thanks, Arseniy
>>
>> Ops, this is my mistake, I must return 'false' in this case. Seems I didn't
>> catch this problem with my tests, because there was no test case where
>> zerocopy will fallback to copy!
>>
>> I'll fix it and add new test!
>
> yep, I agree!
>
>>
>>>
>>> Can we also mention in the commit description why this is need only for
>>> virtio_tranport and not for vhost and loopback?
>>>
>>>> + int pages_in_iov = iov_iter_npages(iov_iter, MAX_SKB_FRAGS);
>>>> + int pages_to_send = min(pages_in_iov, MAX_SKB_FRAGS);
>>>> +
>>>> + return t_ops->can_msgzerocopy(pages_to_send);
>>>> + }
>>>> +
>>>> + return true;
>>>> +}
>>>> +
>
> [...]
>
>>>> @@ -270,6 +395,17 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>>>> break;
>>>> }
>>>>
>>>> + /* This is last skb to send this portion of data. */
>>>
>>> Sorry I didn't get it :-(
>>>
>>> Can you elaborate this a bit more?
>>
>> I mean that we iterate over user's buffer here, allocating skb on each
>> iteration. And for last skb for this buffer we initialize completion
>> for user (we need to allocate one completion for one syscall).
>
> Okay, so maybe we should explain better also in the code comment.
>>
>> Thanks for review, I'll fix all other comments and resend patchset when
>> 'net-next' will be opened again.
>
> Cool, thanks!
> Stefano
>
Powered by blists - more mailing lists