Message-ID: <p5toijsqhehc4kp7gztto3nmrqa33f62duozoxn7u5hh6d2xpe@lfzy6kdszegf>
Date: Fri, 11 Apr 2025 15:43:44 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	"Michael S. Tsirkin" <mst@...hat.com>, Jason Wang <jasowang@...hat.com>, 
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Eugenio Pérez <eperezma@...hat.com>, 
	Stefan Hajnoczi <stefanha@...hat.com>, virtualization@...ts.linux.dev, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH net-next 1/2] vsock: Linger on unsent data

On Mon, Apr 07, 2025 at 08:41:43PM +0200, Michal Luczaj wrote:
>Change the behaviour of a lingering close(): instead of waiting for all
>data to be consumed, block until data is considered sent, i.e. until worker
>picks the packets and decrements virtio_vsock_sock::bytes_unsent down to 0.
>
>Do linger on shutdown() just as well.

I think this should go in a separate patch.

>
>Signed-off-by: Michal Luczaj <mhal@...x.co>
>---
> include/net/af_vsock.h                  |  1 +
> net/vmw_vsock/af_vsock.c                | 25 +++++++++++++++++++++++++
> net/vmw_vsock/virtio_transport_common.c | 25 +++----------------------
> 3 files changed, 29 insertions(+), 22 deletions(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index 9e85424c834353d016a527070dd62e15ff3bfce1..bd8b88d70423051dd05fc445fe37971af631ba03 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -221,6 +221,7 @@ void vsock_for_each_connected_socket(struct vsock_transport *transport,
> 				     void (*fn)(struct sock *sk));
> int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk);
> bool vsock_find_cid(unsigned int cid);
>+void vsock_linger(struct sock *sk, long timeout);
>
> /**** TAP ****/
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index fc6afbc8d6806a4d98c66abc3af4bd139c583b08..383c6644d047589035c0439c47d1440273e67ea9 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1013,6 +1013,29 @@ static int vsock_getname(struct socket *sock,
> 	return err;
> }
>
>+void vsock_linger(struct sock *sk, long timeout)
>+{
>+	if (timeout) {

I would prefer to avoid the whole nested block and return immediately
in that case. (It's pre-existing, but since we are moving this code
anyway, I'd fix it here.)

	if (!timeout)
		return;
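
So the top of the function would become something like this (untested,
just to show the idea):

	void vsock_linger(struct sock *sk, long timeout)
	{
		DEFINE_WAIT_FUNC(wait, woken_wake_function);
		ssize_t (*unsent)(struct vsock_sock *vsk);
		struct vsock_sock *vsk = vsock_sk(sk);

		/* Nothing to wait for if the socket doesn't linger. */
		if (!timeout)
			return;
	...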

>+		DEFINE_WAIT_FUNC(wait, woken_wake_function);
>+		ssize_t (*unsent)(struct vsock_sock *vsk);
>+		struct vsock_sock *vsk = vsock_sk(sk);
>+

transport->unsent_bytes can be NULL; this will panic with the hyperv
or vmci transports, especially since we now call this function in
vsock_shutdown().

I'd skip the call when the transport doesn't implement it, but please
add a comment on top of this function documenting that.
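
Something like this (just a sketch, feel free to reword):

	/* Only transports that implement the unsent_bytes() callback
	 * can linger on unsent data; for the others (e.g. hyperv,
	 * vmci) this function is a no-op.
	 */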

>+		unsent = vsk->transport->unsent_bytes;
>+		if (!unsent)
>+			return;
>+
>+		add_wait_queue(sk_sleep(sk), &wait);
>+
>+		do {
>+			if (sk_wait_event(sk, &timeout, unsent(vsk) == 0, &wait))
>+				break;
>+		} while (!signal_pending(current) && timeout);
>+
>+		remove_wait_queue(sk_sleep(sk), &wait);
>+	}
>+}
>+EXPORT_SYMBOL_GPL(vsock_linger);
>+
> static int vsock_shutdown(struct socket *sock, int mode)
> {
> 	int err;
>@@ -1056,6 +1079,8 @@ static int vsock_shutdown(struct socket *sock, int mode)
> 		if (sock_type_connectible(sk->sk_type)) {
> 			sock_reset_flag(sk, SOCK_DONE);
> 			vsock_send_shutdown(sk, mode);
>+			if (sock_flag(sk, SOCK_LINGER))
>+				vsock_linger(sk, sk->sk_lingertime);
> 		}
> 	}
>
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 7f7de6d8809655fe522749fbbc9025df71f071bd..66ff2e694e474ad16f70248cc1dc235f4e1ebaa1 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -1192,23 +1192,6 @@ static void virtio_transport_remove_sock(struct vsock_sock *vsk)
> 	vsock_remove_sock(vsk);
> }
>
>-static void virtio_transport_wait_close(struct sock *sk, long timeout)
>-{
>-	if (timeout) {
>-		DEFINE_WAIT_FUNC(wait, woken_wake_function);
>-
>-		add_wait_queue(sk_sleep(sk), &wait);
>-
>-		do {
>-			if (sk_wait_event(sk, &timeout,
>-					  sock_flag(sk, SOCK_DONE), &wait))
>-				break;
>-		} while (!signal_pending(current) && timeout);
>-
>-		remove_wait_queue(sk_sleep(sk), &wait);
>-	}
>-}

Anyway, in this patch we are doing 3 things:
1. check unsent_bytes during wait_close()
2. move wait_close() into af_vsock.c, adding vsock_linger()
3. also call vsock_linger() during vsock_shutdown()

I think this is a clear signal that we need to split it into at least
3 patches.

>-
> static void virtio_transport_cancel_close_work(struct vsock_sock *vsk,
> 					       bool cancel_timeout)
> {
>@@ -1279,15 +1262,13 @@ static bool virtio_transport_close(struct vsock_sock *vsk)
> 		(void)virtio_transport_shutdown(vsk, SHUTDOWN_MASK);
>
> 	if (sock_flag(sk, SOCK_LINGER) && !(current->flags & PF_EXITING))
>-		virtio_transport_wait_close(sk, sk->sk_lingertime);
>+		vsock_linger(sk, sk->sk_lingertime);

Might it make sense, as you did for vsock_shutdown(), to move this
into af_vsock.c (in another patch) so that we support lingering for
all transports?
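
I'm thinking of something like this in __vsock_release() (untested,
and the exact placement needs checking, just to give the idea):

	/* Wait for unsent data before releasing the transport. */
	if (sock_flag(sk, SOCK_LINGER) && !(current->flags & PF_EXITING))
		vsock_linger(sk, sk->sk_lingertime);

Then virtio_transport_close() could drop its own call entirely.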

>
>-	if (sock_flag(sk, SOCK_DONE)) {
>+	if (sock_flag(sk, SOCK_DONE))
> 		return true;
>-	}

Please avoid these unrelated changes; if you really want to fix them,
add another patch to the series for that.

>
> 	sock_hold(sk);
>-	INIT_DELAYED_WORK(&vsk->close_work,
>-			  virtio_transport_close_timeout);
>+	INIT_DELAYED_WORK(&vsk->close_work, virtio_transport_close_timeout);

Ditto.

These two could go together in a single `cleanup` patch, although I
usually avoid that so that `git blame` stays meaningful. But if we
want to make checkpatch happy, that's fine.

Thanks,
Stefano

> 	vsk->close_work_scheduled = true;
> 	schedule_delayed_work(&vsk->close_work, VSOCK_CLOSE_TIMEOUT);
> 	return false;
>
>-- 
>2.49.0
>

