netdev - Re: [PATCH v10 11/18] virtio/vsock: dequeue callback for SOCK_SEQPACKET

Message-ID: <6b833ccf-ea93-db6a-4743-463ac1cfe817@kaspersky.com>
Date:   Fri, 4 Jun 2021 16:12:23 +0300
From:   Arseny Krasnov <arseny.krasnov@...persky.com>
To:     Stefano Garzarella <sgarzare@...hat.com>
CC:     Stefan Hajnoczi <stefanha@...hat.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jorgen Hansen <jhansen@...are.com>,
        Norbert Slusarek <nslusarek@....net>,
        Colin Ian King <colin.king@...onical.com>,
        Andra Paraschiv <andraprs@...zon.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "oxffffaa@...il.com" <oxffffaa@...il.com>
Subject: Re: [PATCH v10 11/18] virtio/vsock: dequeue callback for
 SOCK_SEQPACKET


On 03.06.2021 17:45, Stefano Garzarella wrote:
> On Thu, May 20, 2021 at 10:17:58PM +0300, Arseny Krasnov wrote:
>> Callback fetches RW packets from rx queue of socket until whole record
>> is copied (if user's buffer is full, user is not woken up). This is done
>> to not stall sender, because if we wake up user and it leaves syscall,
>> nobody will send credit update for rest of record, and sender will wait
>> for next enter of read syscall at receiver's side. So if user buffer is
>> full, we just send credit update and drop data.
>>
>> Signed-off-by: Arseny Krasnov <arseny.krasnov@...persky.com>
>> ---
>> v9 -> v10:
>> 1) Number of dequeued bytes incremented even in case when
>>    user's buffer is full.
>> 2) Use 'msg_data_left()' instead of direct access to 'msg_hdr'.
>> 3) Rename variable 'err' to 'dequeued_len', in case of error
>>    it has negative value.
>>
>> include/linux/virtio_vsock.h            |  5 ++
>> net/vmw_vsock/virtio_transport_common.c | 65 +++++++++++++++++++++++++
>> 2 files changed, 70 insertions(+)
>>
>> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>> index dc636b727179..02acf6e9ae04 100644
>> --- a/include/linux/virtio_vsock.h
>> +++ b/include/linux/virtio_vsock.h
>> @@ -80,6 +80,11 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
>> 			       struct msghdr *msg,
>> 			       size_t len, int flags);
>>
>> +ssize_t
>> +virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
>> +				   struct msghdr *msg,
>> +				   int flags,
>> +				   bool *msg_ready);
>> s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
>> s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
>>
>> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>> index ad0d34d41444..61349b2ea7fe 100644
>> --- a/net/vmw_vsock/virtio_transport_common.c
>> +++ b/net/vmw_vsock/virtio_transport_common.c
>> @@ -393,6 +393,59 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
>> 	return err;
>> }
>>
>> +static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
>> +						 struct msghdr *msg,
>> +						 int flags,
>> +						 bool *msg_ready)
>> +{
>> +	struct virtio_vsock_sock *vvs = vsk->trans;
>> +	struct virtio_vsock_pkt *pkt;
>> +	int dequeued_len = 0;
>> +	size_t user_buf_len = msg_data_left(msg);
>> +
>> +	*msg_ready = false;
>> +	spin_lock_bh(&vvs->rx_lock);
>> +
>> +	while (!*msg_ready && !list_empty(&vvs->rx_queue) && dequeued_len >= 0) {
> I'
>
>> +		size_t bytes_to_copy;
>> +		size_t pkt_len;
>> +
>> +		pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
>> +		pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
>> +		bytes_to_copy = min(user_buf_len, pkt_len);
>> +
>> +		if (bytes_to_copy) {
>> +			/* sk_lock is held by caller so no one else can dequeue.
>> +			 * Unlock rx_lock since memcpy_to_msg() may sleep.
>> +			 */
>> +			spin_unlock_bh(&vvs->rx_lock);
>> +
>> +			if (memcpy_to_msg(msg, pkt->buf, bytes_to_copy))
>> +				dequeued_len = -EINVAL;
> I think it is better here to return the error returned by memcpy_to_msg(),
> as we do in the other places where we use memcpy_to_msg().
>
> I mean something like this:
> 			err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
> 			if (err)
> 				dequeued_len = err;
Ack
>> +			else
>> +				user_buf_len -= bytes_to_copy;
>> +
>> +			spin_lock_bh(&vvs->rx_lock);
>> +		}
>> +
> Maybe here we can simply break the cycle if we have an error:
> 		if (dequeued_len < 0)
> 			break;
>
> Or we can refactor a bit, simplifying the while() condition and also the 
> code in this way (not tested):
>
> 	while (!*msg_ready && !list_empty(&vvs->rx_queue)) {
> 		...
>
> 		if (bytes_to_copy) {
> 			int err;
>
> 			/* ...
> 			*/
> 			spin_unlock_bh(&vvs->rx_lock);
> 			err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
> 			if (err) {
> 				dequeued_len = err;
> 				goto out;
> 			}
> 			spin_lock_bh(&vvs->rx_lock);
>
> 			user_buf_len -= bytes_to_copy;
> 		}
>
> 		dequeued_len += pkt_len;
>
> 		if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR)
> 			*msg_ready = true;
>
> 		virtio_transport_dec_rx_pkt(vvs, pkt);
> 		list_del(&pkt->list);
> 		virtio_transport_free_pkt(pkt);
> 	}
>
> out:
> 	spin_unlock_bh(&vvs->rx_lock);
>
> 	virtio_transport_send_credit_update(vsk);
>
> 	return dequeued_len;
> }

I think we can't do 'goto out' or break there, because in case of error we
still need to free the packet. It is possible to do something like this:

		virtio_transport_dec_rx_pkt(vvs, pkt);
		list_del(&pkt->list);
		virtio_transport_free_pkt(pkt);

		if (dequeued_len < 0)
			break;
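
To make the ordering concrete, here is a minimal userspace sketch of the
loop (not kernel code, not tested against the real transport). All names in
it (struct pkt, struct rxq, rxq_push(), copy_to_user_buf(),
seqpacket_do_dequeue()) are hypothetical stand-ins for the kernel types and
helpers; rx_lock handling and credit updates are left out on purpose, and
the 'dequeued_len >= 0' guard is my assumption about how the rest of the
function would look:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct virtio_vsock_pkt. */
struct pkt {
	struct pkt *next;
	size_t len;
	bool eor;       /* models VIRTIO_VSOCK_SEQ_EOR in pkt->hdr.flags */
	bool fail_copy; /* test hook: force the copy of this packet to fail */
	char buf[16];
};

/* Hypothetical stand-in for the socket rx queue. */
struct rxq {
	struct pkt *head;
	size_t freed;   /* packets released so far, only for checking */
};

/* Append a packet carrying 'data' to the tail of the queue. */
static void rxq_push(struct rxq *q, const char *data, bool eor, bool fail_copy)
{
	struct pkt *p = calloc(1, sizeof(*p));
	struct pkt **pp = &q->head;
	size_t len = strlen(data);

	assert(p && len <= sizeof(p->buf));
	memcpy(p->buf, data, len);
	p->len = len;
	p->eor = eor;
	p->fail_copy = fail_copy;
	while (*pp)
		pp = &(*pp)->next;
	*pp = p;
}

/* Models memcpy_to_msg(): returns 0 or a negative errno. */
static int copy_to_user_buf(char *dst, const struct pkt *p, size_t n)
{
	if (p->fail_copy)
		return -EFAULT;
	memcpy(dst, p->buf, n);
	return 0;
}

static int seqpacket_do_dequeue(struct rxq *q, char *ubuf, size_t ubuf_len,
				bool *msg_ready)
{
	int dequeued_len = 0;
	size_t off = 0;

	*msg_ready = false;

	while (!*msg_ready && q->head) {
		struct pkt *p = q->head;
		size_t room = ubuf_len - off;
		size_t bytes_to_copy = p->len < room ? p->len : room;

		if (bytes_to_copy) {
			int err = copy_to_user_buf(ubuf + off, p, bytes_to_copy);

			if (err)
				dequeued_len = err; /* propagate the real error */
			else
				off += bytes_to_copy;
		}

		if (dequeued_len >= 0) {
			/* Count the whole packet even if the buffer was full. */
			dequeued_len += (int)p->len;
			if (p->eor)
				*msg_ready = true;
		}

		/* Unlink and free the packet on success and on error alike. */
		q->head = p->next;
		free(p);
		q->freed++;

		/* Leave the loop only after the packet has been released. */
		if (dequeued_len < 0)
			break;
	}

	return dequeued_len;
}
```

In this model, a record of two packets ("abc", then "defg" with EOR) read
into an 8-byte buffer returns 7 with *msg_ready set. If the copy of the
first packet fails, the function returns the copy error, and that packet is
still freed before the loop exits.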

>
>