lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 18 Mar 2023 21:01:45 +0300
From:   Arseniy Krasnov <avkrasnov@...rdevices.ru>
To:     Bobby Eshleman <bobbyeshleman@...il.com>
CC:     Stefan Hajnoczi <stefanha@...hat.com>,
        Stefano Garzarella <sgarzare@...hat.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Bobby Eshleman <bobby.eshleman@...edance.com>,
        <kvm@...r.kernel.org>, <virtualization@...ts.linux-foundation.org>,
        <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <kernel@...rdevices.ru>, <oxffffaa@...il.com>
Subject: Re: [RFC PATCH v1] virtio/vsock: allocate multiple skbuffs on tx



On 18.03.2023 00:52, Bobby Eshleman wrote:
> On Fri, Mar 17, 2023 at 01:38:39PM +0300, Arseniy Krasnov wrote:
>> This adds small optimization for tx path: instead of allocating single
>> skbuff on every call to transport, allocate multiple skbuffs until
>> credit space allows, thus trying to send as much as possible data without
>> return to af_vsock.c.
> 
> Hey Arseniy, I really like this optimization. I have a few
> questions/comments below.
> 
>>
>> Signed-off-by: Arseniy Krasnov <AVKrasnov@...rdevices.ru>
>> ---
>>  net/vmw_vsock/virtio_transport_common.c | 45 +++++++++++++++++--------
>>  1 file changed, 31 insertions(+), 14 deletions(-)
>>
>> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>> index 6564192e7f20..cda587196475 100644
>> --- a/net/vmw_vsock/virtio_transport_common.c
>> +++ b/net/vmw_vsock/virtio_transport_common.c
>> @@ -196,7 +196,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>>  	const struct virtio_transport *t_ops;
>>  	struct virtio_vsock_sock *vvs;
>>  	u32 pkt_len = info->pkt_len;
>> -	struct sk_buff *skb;
>> +	u32 rest_len;
>> +	int ret;
>>  
>>  	info->type = virtio_transport_get_type(sk_vsock(vsk));
>>  
>> @@ -216,10 +217,6 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>>  
>>  	vvs = vsk->trans;
>>  
>> -	/* we can send less than pkt_len bytes */
>> -	if (pkt_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
>> -		pkt_len = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE;
>> -
>>  	/* virtio_transport_get_credit might return less than pkt_len credit */
>>  	pkt_len = virtio_transport_get_credit(vvs, pkt_len);
>>  
>> @@ -227,17 +224,37 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
>>  	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
>>  		return pkt_len;
>>  
>> -	skb = virtio_transport_alloc_skb(info, pkt_len,
>> -					 src_cid, src_port,
>> -					 dst_cid, dst_port);
>> -	if (!skb) {
>> -		virtio_transport_put_credit(vvs, pkt_len);
>> -		return -ENOMEM;
>> -	}
>> +	rest_len = pkt_len;
>>  
>> -	virtio_transport_inc_tx_pkt(vvs, skb);
>> +	do {
>> +		struct sk_buff *skb;
>> +		size_t skb_len;
>> +
>> +		skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE, rest_len);
>> +
>> +		skb = virtio_transport_alloc_skb(info, skb_len,
>> +						 src_cid, src_port,
>> +						 dst_cid, dst_port);
>> +		if (!skb) {
>> +			ret = -ENOMEM;
>> +			goto out;
>> +		}
> 
> In this case, if a previous round of the loop succeeded with send_pkt(),
> I think that we may still want to return the number of bytes that have
> successfully been sent so far?
> 
Hello! Thanks for review!

Yes, You are right, seems this patch breaks partial send return value. For example for the
following iov (suppose each '.iov_len' is 64Kb, e.g. max packet length):

[0] = { .iov_base = ptr0, .iov_len = len0 },
[1] = { .iov_base = NULL, .iov_len = len1 },
[2] = { .iov_base = ptr2, .iov_len = len2 }

transport callback will send element 0, but NULL iov_base of element 1 will cause tx failure.
Transport callback returns error (no information about transmitted skbuffs), but element 0 was
already passed to virtio/vhost path.

Current logic will return length of element 0 (it will be accounted to return from send syscall),
then calls transport again with invalid element 1 which triggers error.

I'm not sure that it is correct (at least in this single patch) to return number of bytes sent,
because tx loop in af_vsock.c compares length of user's buffer and number of bytes sent to break
tx loop (or loop is terminated when transport returns error). For above iov, we return length of
element 0 without length of invalid element 1, but not error (so loop exit condition never won't
be true). Moreover, with this approach only first failed to tx skbuff will return error. For second,
third, etc. skbuffs we get only number of bytes.

I think may be we can use socket's 'sk_err' field here: when tx callback failed to send data(no
matter it is first byte or last byte of middle byte), it returns number of bytes sent (it will be
0 if first skbuff was failed to sent) and sets 'sk_err'. Good thing here is that tx loop in af_vsock.c
already has check for 'sk_err' value and break loop if error occurred. This way looks like 'errno'
concept a little bit: transport returns number of bytes, 'sk_err' contains error. So in current
patch it will look like this: instead of setting 'ret' with error, i set 'sk_err' with error,
but callback returns number of bytes transmitted.

May be we need review from some more experienced guy, Stefano Garzarella, what do You think?

Thanks, Arseniy
>>  
>> -	return t_ops->send_pkt(skb);
>> +		virtio_transport_inc_tx_pkt(vvs, skb);
>> +
>> +		ret = t_ops->send_pkt(skb);
>> +
>> +		if (ret < 0)
>> +			goto out;
> 
> Ditto here.
> 
>> +
>> +		rest_len -= skb_len;
>> +	} while (rest_len);
>> +
>> +	return pkt_len;
>> +
>> +out:
>> +	virtio_transport_put_credit(vvs, rest_len);
>> +	return ret;
>>  }
>>  
>>  static bool virtio_transport_inc_rx_pkt(struct virtio_vsock_sock *vvs,
>> -- 
>> 2.25.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ