linux-kernel - Re: [PATCH 2/2] vsock/virtio: Don't reset the created SOCKET during s2r

Message-ID: <vxz37vz262nujwe6qfyorblpkuvol3ix6ikzv7lpyx5pilx76e@s2wixscnvvuu>
Date: Thu, 13 Feb 2025 10:58:55 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Junnan Wu <junnan01.wu@...sung.com>
Cc: davem@...emloft.net, edumazet@...gle.com, eperezma@...hat.com, 
	horms@...nel.org, jasowang@...hat.com, kuba@...nel.org, kvm@...r.kernel.org, 
	lei19.wang@...sung.com, linux-kernel@...r.kernel.org, mst@...hat.com, 
	netdev@...r.kernel.org, pabeni@...hat.com, q1.huang@...sung.com, stefanha@...hat.com, 
	virtualization@...ts.linux.dev, xuanzhuo@...ux.alibaba.com, ying01.gao@...sung.com, 
	ying123.xu@...sung.com
Subject: Re: [PATCH 2/2] vsock/virtio: Don't reset the created SOCKET during s2r

On Wed, Feb 12, 2025 at 12:48:43PM +0800, Junnan Wu wrote:
>>On Mon, Feb 10, 2025 at 12:48:03PM +0100, leonardi@...hat.com wrote:
>>>Like for the other patch, some maintainers have not been CCd.
>>
>>Yes, please use `scripts/get_maintainer.pl`.
>>
>
>Ok, I will use this script to add the other maintainers in the next version.
>
>>>
>>>On Fri, Feb 07, 2025 at 01:20:33PM +0800, Junnan Wu wrote:
>>>>From: Ying Gao <ying01.gao@...sung.com>
>>>>
>>>>If suspend is executed during vsock communication and the
>>>>socket is reset, the original socket will be unusable after resume.
>>
>>Why? (I mean for a good commit description)
>>
>>>>
>>>>Judge the value of vdev->priv in function virtio_vsock_vqs_del,
>>>>only when the function is invoked by virtio_vsock_remove,
>>>>all vsock connections will be reset.
>>>>
>>>The second part of the commit message is not that clear, do you mind
>>>rephrasing it?
>>
>>+1 on that
>>
>
>Well, I will rephrase it in next version.
>
>>Also in this case, why does checking `vdev->priv` fix the issue?
>>
>>>
>>>>Signed-off-by: Ying Gao <ying01.gao@...sung.com>
>>>Missing Co-developed-by?
>>>>Signed-off-by: Junnan Wu <junnan01.wu@...sung.com>
>>>
>>>
>>>>---
>>>>net/vmw_vsock/virtio_transport.c | 6 ++++--
>>>>1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>>index 9eefd0fba92b..9df609581755 100644
>>>>--- a/net/vmw_vsock/virtio_transport.c
>>>>+++ b/net/vmw_vsock/virtio_transport.c
>>>>@@ -717,8 +717,10 @@ static void virtio_vsock_vqs_del(struct virtio_vsock *vsock)
>>>>	struct sk_buff *skb;
>>>>
>>>>	/* Reset all connected sockets when the VQs disappear */
>>>>-	vsock_for_each_connected_socket(&virtio_transport.transport,
>>>>-					virtio_vsock_reset_sock);
>>>I would add a comment explaining why you are adding this check.
>>
>>Yes, please.
>>
>
>Ok, I will add a comment here in the next version.
>
>>>>+	if (!vdev->priv) {
>>>>+		vsock_for_each_connected_socket(&virtio_transport.transport,
>>>>+						virtio_vsock_reset_sock);
>>>>+	}
>>
>>Okay, after looking at the code I understood why, but please write it
>>into the commit next time!
>>
>>virtio_vsock_vqs_del() is called in 2 cases:
>>1 - in virtio_vsock_remove() after setting `vdev->priv` to null since
>>     the driver is about to be unloaded because the device is for example
>>     removed (hot-unplug)
>>
>>2 - in virtio_vsock_freeze() when suspending, but in this case
>>     `vdev->priv` is not touched.
>>
>>I don't think it is a good idea to rely on that, because in the future
>>it could change. So better to add a parameter to virtio_vsock_vqs_del()
>>to differentiate the 2 use cases.
>>
>>
>>That said, I think this patch is wrong:
>>
>>We are deallocating virtqueues, so all packets that are "in flight" will
>>be completely discarded. Our transport (virtqueues) has no mechanism to
>>retransmit them, so those packets would be lost forever. So we cannot
>>guarantee the reliability of SOCK_STREAM sockets for example.
>>
>>In any case, after a suspension, many connections will be expired in the
>>host anyway, so does it make sense to keep them open in the guest?
>>
>
>If the host still holds the vsock connection during suspend,
>I think the guest should keep it open in this case.
>
>We found a scenario where we freeze while a vsock connection is
>transferring data. After restore, the upper application tries to
>continue sending messages via vsock, and `vsock_connectible_sendmsg`
>returns `ENOTCONN`. But the host does not notice this and keeps
>waiting to receive messages on the old connection.
>If the host doesn't close the old connection, the guest can never
>connect to the host via vsock again, because `EPIPE` is returned.
>
>If we freeze vsock after the send and receive operations have completed,
>this error does not happen, and the guest can still connect to the host
>after resume.
>
>For example:
>In situation 1), if we do the following steps
>    step 1) Host starts a vsock server
>    step 2) Guest starts a vsock client which keeps sending data without limit
>    step 3) Guest freezes and resumes
>Then the vsock connection will be broken and the guest can never connect to
>the host via vsock until the host resets the vsock server.
>
>And in situation 2), if we do the following steps
>    step 1) Host starts a vsock server
>    step 2) Guest starts a vsock client and sends some data
>    step 3) After the client has completed the transmission, guest freezes
>            and resumes
>    step 4) Guest starts a new vsock client and sends some data
>In this situation, the host server doesn't need to be reset, and the guest
>client works well after resume.

Okay, but this is not what this patch is doing, right?
Or have I missed something?

>
>>If you want to support this use case, you must first provide a way to
>>keep those packets somewhere (e.g. avoiding to remove the virtqueues?),
>>but I honestly don't understand the use case.
>>
>
>In the case where the guest sends packets via vsock that require no reply,
>when the guest suspends, the sending will also be suspended
>and no packets will be lost after resume.

You can try this simple example to check if it works or not:

guest$ dd if=/dev/urandom of=bigfile bs=1M count=10240
guest$ md5sum bigfile
e412f2803a89da265d53a28dea0f0da7  bigfile

host$ nc --vsock -p 1234 -l > bigfile
guest$ cat bigfile | nc --vsock 2 1234

# while sending do a suspend/resume cycle

# Without your patch, nc should fail, so the user knows that something
# went wrong with the communication; with your patch, it should not fail.

host$ md5sum bigfile


Is the md5sum the same? If not it means you lost packets and we can't do 
that.

>
>And when the host is sending packets via vsock while the guest suspends and
>the VQs disappear, like you mentioned, those packets will be lost.
>But I think those packets should be kept on the host device side,
>with a guarantee that once the guest resumes,
>the host device picks them up and continues sending.

The host will stop using the virtqueues after the driver calls 
`virtio_reset_device()`, so we should handle all the packets already 
queued in the RX virtqueue. Once the host has put them in the 
virtqueue, it has no way to track them, so it should be up to the 
driver in the guest to stop the device and then check all the buffers 
already queued.

But currently we also call 
`virtio_vsock_skb_queue_purge(&vsock->send_pkt_queue);` which will 
discard all the packets queued by applications in the guest that had 
not even been queued in the virtqueue yet.

So again, this patch, as it is, is absolutely not right.

I understand the use case and it's clear to me now, but please write it 
in the commit description.

In summary, if we want to support your use case (and that is fine by 
me), we need to do better in the driver:

- we must not purge `send_pkt_queue`
- we need to make sure that all buffers that the host has put in the RX 
   virtqueue are handled by the guest
- we need to make sure that all buffers that the guest has put in the TX 
   virtqueue are handled by the host or put back on top of send_pkt_queue
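
To illustrate the three points above, here is a minimal userspace sketch,
not kernel code. Everything in it is hypothetical and only models the
idea: `struct pkt_queue`, `struct model_vsock`, `model_vqs_del()` and its
`removing` parameter are names made up for this example. The real driver
works on sk_buff queues and virtqueues, and the real fix may well look
different.

```c
/* Userspace model only: none of these types or helpers exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QCAP 16

/* Fixed-size FIFO of packet ids, standing in for a packet queue. */
struct pkt_queue {
	int ids[QCAP];
	size_t len;
};

/* Append a packet at the tail (normal enqueue). */
static void queue_tail(struct pkt_queue *q, int id)
{
	assert(q->len < QCAP);
	q->ids[q->len++] = id;
}

/* Insert a packet at the head, so it goes out before anything queued later. */
static void queue_head(struct pkt_queue *q, int id)
{
	size_t i;

	assert(q->len < QCAP);
	for (i = q->len; i > 0; i--)
		q->ids[i] = q->ids[i - 1];
	q->ids[0] = id;
	q->len++;
}

struct model_vsock {
	struct pkt_queue send_pkt_queue; /* queued by applications, not yet in the TX vq */
	struct pkt_queue tx_vq;          /* in the TX vq, not yet consumed by the host */
	struct pkt_queue rx_vq;          /* put in the RX vq by the host, not yet handled */
	struct pkt_queue delivered;      /* already handed over to the sockets */
	bool sockets_reset;
};

/* 'removing' tells the two callers apart instead of peeking at vdev->priv. */
void model_vqs_del(struct model_vsock *v, bool removing)
{
	size_t i;

	if (removing) {
		/* Driver unload or hot-unplug: reset connections, drop everything. */
		v->sockets_reset = true;
		v->send_pkt_queue.len = 0;
		v->tx_vq.len = 0;
		v->rx_vq.len = 0;
		return;
	}

	/* Suspend: handle every buffer the host already put in the RX vq. */
	for (i = 0; i < v->rx_vq.len; i++)
		queue_tail(&v->delivered, v->rx_vq.ids[i]);
	v->rx_vq.len = 0;

	/* Put unconsumed TX buffers back on top of send_pkt_queue. Walking
	 * from newest to oldest while inserting at the head keeps the
	 * original order. send_pkt_queue itself is not purged.
	 */
	while (v->tx_vq.len > 0)
		queue_head(&v->send_pkt_queue, v->tx_vq.ids[--v->tx_vq.len]);
}
```

With packets 1 and 2 still in the TX vq and packet 3 in `send_pkt_queue`,
calling `model_vqs_del(&v, false)` leaves `send_pkt_queue` holding 1, 2, 3
in that order, so nothing is lost or reordered across the suspend.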

Thanks,
Stefano

>
>Thanks,
>Junnan Wu
>
>>To be clear, this behavior is intended, and it's for example the same as
>>when suspending the VM in the hypervisor directly, which afterwards
>>sends an event to the guest just to close all connections, because it's
>>complicated to keep them active.
>>
>>Thanks,
>>Stefano
>>
>>
>>
>>>>
>>>>	/* Stop all work handlers to make sure no one is accessing the device,
>>>>	 * so we can safely call virtio_reset_device().
>>>>--
>>>>2.34.1
>>>>
>>>
>>>I am not familiar with freeze/resume, but I don't see any problems
>>>with this patch.
>>>
>>>Thank you,
>>>Luigi
>>>
>