Message-ID: <39b2e9fd-601b-189d-39a9-914e5574524c@sberdevices.ru>
Date: Sat, 17 Dec 2022 19:42:04 +0000
From: Arseniy Krasnov <AVKrasnov@...rdevices.ru>
To: Stefano Garzarella <sgarzare@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
kernel <kernel@...rdevices.ru>,
Krasnov Arseniy <oxffffaa@...il.com>,
Arseniy Krasnov <AVKrasnov@...rdevices.ru>
Subject: [RFC PATCH v1 0/2] virtio/vsock: fix mutual rx/tx hungup
Hello,
it seems I have found a strange thing (maybe a bug) where the sender
('tx' below) and the receiver ('rx' below) can get stuck forever. A
potential fix is in the first patch; the second patch contains a
reproducer based on the vsock test suite.
The reproducer is simple: tx just sends data to rx with the 'write()'
syscall, while rx dequeues it with the 'read()' syscall and uses 'poll()'
for waiting. I run the server on the host and the client in the guest.
rx side params:
1) SO_VM_SOCKETS_BUFFER_SIZE is 256Kb (i.e. the default).
2) SO_RCVLOWAT is 128Kb.
What happens in the reproducer step by step:
1) tx tries to send 256Kb + 1 byte (in a single 'write()')
2) tx sends 256Kb, data reaches rx (rx_bytes == 256Kb)
3) tx waits for space in 'write()' to send last 1 byte
4) rx does poll(), (rx_bytes >= rcvlowat) 256Kb >= 128Kb, POLLIN is set
5) rx reads 64Kb, credit update is not sent due to *
6) rx does poll(), (rx_bytes >= rcvlowat) 192Kb >= 128Kb, POLLIN is set
7) rx reads 64Kb, credit update is not sent due to *
8) rx does poll(), (rx_bytes >= rcvlowat) 128Kb >= 128Kb, POLLIN is set
9) rx reads 64Kb, credit update is not sent due to *
10) rx does poll(), (rx_bytes < rcvlowat) 64Kb < 128Kb, rx waits in poll()
* is an optimization in 'virtio_transport_stream_do_dequeue()', which
sends OP_CREDIT_UPDATE only when free space is low, i.e. less than
VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
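The steps above can be replayed with a small userspace model. This is
only a sketch, not kernel code: 'struct rx_state', the helper names and
the free-space accounting (buffer size minus bytes read but not yet
reported to tx) are assumptions made for illustration, as is the 64Kb
value used for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define KB          1024u
#define BUF_SIZE    (256u * KB) /* rx SO_VM_SOCKETS_BUFFER_SIZE */
#define RCVLOWAT    (128u * KB) /* rx SO_RCVLOWAT */
#define READ_CHUNK  (64u * KB)  /* size of each rx read() */
/* Assumption: VIRTIO_VSOCK_MAX_PKT_BUF_SIZE is 64Kb. */
#define MAX_PKT_BUF (64u * KB)

/* Hypothetical model of the receiver, not a kernel structure. */
struct rx_state {
    unsigned rx_bytes;       /* bytes sitting in the receive queue */
    unsigned unreported;     /* bytes read since the last credit update */
    unsigned credit_updates; /* number of credit updates sent to tx */
};

/* Model of one read(): dequeue len bytes, then apply the "*" check. */
static void rx_read(struct rx_state *rx, unsigned len)
{
    assert(len <= rx->rx_bytes);
    rx->rx_bytes -= len;
    rx->unreported += len;

    /* Assumed accounting: buffer size minus bytes not yet reported to tx. */
    unsigned free_space = BUF_SIZE - rx->unreported;
    if (free_space < MAX_PKT_BUF) {
        rx->credit_updates++;
        rx->unreported = 0;
    }
}

/* Model of poll(): POLLIN only when rx_bytes >= SO_RCVLOWAT. */
static bool rx_pollin(const struct rx_state *rx)
{
    return rx->rx_bytes >= RCVLOWAT;
}

/* Replays steps 2-10; returns true if both sides end up blocked. */
static bool run_scenario(struct rx_state *rx)
{
    unsigned tx_pending = 1;   /* the last byte of the 256Kb + 1 write() */

    rx->rx_bytes = BUF_SIZE;   /* step 2: 256Kb has reached rx */
    rx->unreported = 0;
    rx->credit_updates = 0;

    while (rx_pollin(rx)) {
        rx_read(rx, READ_CHUNK);
        printf("rx_bytes=%uKb credit_updates=%u\n",
               rx->rx_bytes / KB, rx->credit_updates);
    }

    /* In this model tx learns about freed space only via a credit update. */
    bool tx_blocked = tx_pending > 0 && rx->credit_updates == 0;
    bool rx_blocked = !rx_pollin(rx);
    return tx_blocked && rx_blocked;
}
```

Calling run_scenario() on a 'struct rx_state' leaves 64Kb in the queue
with no credit update sent, so in this model both sides are blocked.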
Now the tx side waits for space inside write() and rx waits in poll() for
'rx_bytes' to reach the SO_RCVLOWAT value. Both sides will wait forever.
I think a possible fix is to send a credit update not only when free
space is too small, but also when the number of bytes in the receive
queue is smaller than SO_RCVLOWAT and thus not enough to wake up the
sleeping reader. I'm not sure about the correctness of this idea, but in
any case I think the problem above exists. What do you think?
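The proposed condition can be written as a pair of predicates. This is a
sketch of the idea only, not the code from the first patch: the function
names are hypothetical, 'free_space' is whatever value the dequeue path
already computes, and the 64Kb value for VIRTIO_VSOCK_MAX_PKT_BUF_SIZE is
an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define KB 1024u
/* Assumption: VIRTIO_VSOCK_MAX_PKT_BUF_SIZE is 64Kb. */
#define MAX_PKT_BUF (64u * KB)

/* Existing rule: send a credit update only when free space is small. */
static bool update_current(unsigned free_space)
{
    return free_space < MAX_PKT_BUF;
}

/*
 * Proposed rule: also send a credit update when the receive queue holds
 * fewer bytes than SO_RCVLOWAT, since poll() will not wake the reader.
 */
static bool update_proposed(unsigned free_space, unsigned rx_bytes,
                            unsigned rcvlowat)
{
    return free_space < MAX_PKT_BUF || rx_bytes < rcvlowat;
}
```

For the state after the last read in the reproducer (64Kb left in the
queue, SO_RCVLOWAT of 128Kb), the existing rule stays silent as long as
'free_space' is at least 64Kb, while the proposed rule asks for a credit
update, which lets tx make progress.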
The patchset was rebased onto and tested with the skbuff v7 patch from Bobby Eshleman:
https://lore.kernel.org/netdev/20221213192843.421032-1-bobby.eshleman@bytedance.com/
Arseniy Krasnov(2):
virtio/vsock: send credit update depending on SO_RCVLOWAT
vsock_test: mutual hungup reproducer
net/vmw_vsock/virtio_transport_common.c | 9 +++-
tools/testing/vsock/vsock_test.c | 78 +++++++++++++++++++++++++++++++++
2 files changed, 85 insertions(+), 2 deletions(-)
--
2.25.1