netdev - Re: vsock broken after connect() returns EINTR (was Re: [PATCH net 2/2] vsock/test: Add test for SO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cosqkkilcmorj5kmivfn3qhd2ixmjnrx7b2gv6ueadvh344yrh@ppqrpwfaql7d>
Date: Tue, 18 Nov 2025 11:14:12 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, netdev@...r.kernel.org
Subject: Re: vsock broken after connect() returns EINTR (was Re: [PATCH net
 2/2] vsock/test: Add test for SO_LINGER null ptr deref)

On Fri, Jul 25, 2025 at 11:06:52AM +0200, Michal Luczaj wrote:
>On 4/15/25 15:07, Stefano Garzarella wrote:
>> On Fri, Apr 11, 2025 at 04:43:35PM +0200, Michal Luczaj wrote:
>...
>>> Once connect() fails with EINTR (e.g. due to a signal delivery), retrying
>>> connect() (to the same listener) fails. That is what the code below was
>>> trying to show.
>>
>> mmm, something is going wrong in the vsock_connect().
>>
>> IIUC if we fails with EINTR, we are kind of resetting the socket.
>> Should we do the same we do in vsock_assign_transport() when we found
>> that we are changing transport?
>>
>> I mean calling release(), vsock_deassign_transport(). etc.
>> I'm worried about having pending packets in flight.
>>
>> BTW we need to investigate more, I agree.
>
>I took a look. Once I've added a condition to break the connect() loop
>(right after schedule_timeout(), on `sk_state == TCP_ESTABLISHED`), things
>got a bit clearer for me.

Great! and sorry for the delay!

>
>Because listener keeps child socket in the accept_queue _and_
>connected_table, every time we try to re-connect() our
>VIRTIO_VSOCK_OP_REQUEST is answered with VIRTIO_VSOCK_OP_RST, which kills
>the re-connect(). IOW:
>
>L = socket()
>listen(L)
>				S = socket()
>				connect(S, L)
>					S.sk_state = TCP_SYN_SENT
>					send VIRTIO_VSOCK_OP_REQUEST
>
>				connect() is interrupted
>					S.sk_state = TCP_CLOSE
>
>L receives REQUEST
>	C = socket()
>	C.sk_state = TCP_ESTABLISHED
>	add C to L.accept_queue
>	add C to connected_table
>	send VIRTIO_VSOCK_OP_RESPONSE
>
>				S receives RESPONSE
>					unexpected state:
>					S.sk_state != TCP_SYN_SENT
>					send VIRTIO_VSOCK_OP_RST
>
>C receives RST
>	virtio_transport_do_close()
>		C.sk_state = TCP_CLOSING
>
>				retry connect(S, L)
>					S.sk_state = TCP_SYN_SENT
>					send VIRTIO_VSOCK_OP_REQUEST
>
>C (not L!) receives REQUEST
>	send VIRTIO_VSOCK_OP_RST
>
>				S receives RST, connect() fails
>					S.sk_state = TCP_CLOSE
>					S.sk_err = ECONNRESET
>
>I was mistaken that flushing the accept queue, i.e. close(accept()), would
>be enough to drop C from connected_table and let the re-connect() succeed.
>In fact, for the removal to actually take place you need to wait a bit
>after the flushing close(). What's more, child's vsock_release() might
>throw a poorly timed OP_RST at a client -- affecting the re-connect().
>
>Now, one thing we can do about this is: nothing. Let the user space handle
>the client- and server-side consequences of client-side's signal delivery
>(or timeout).

Perhaps until we receive a specific request from a user, we could leave 
things as they are, because it really seems like a rare corner case to 
me.

>
>Another thing we can do is switch to a 3-way handshake. I think that would
>also eliminate the need for vsock_transport_cancel_pkt(), which was
>introduced in commit 380feae0def7 ("vsock: cancel packets when failing to
>connect").

Unfortunately, I think this is a problem. For now, each transport 
implements whatever it wants.

In the case of virtio-vsock, we have a specification, and making this 
change will require a change in the specification. In addition, we would 
still have to continue to support the old versions in some way, so I 
don't know if it just adds complexity.

>
>All the other options I was considering (based on the idea to send client
>-> server OP_SHUTDOWN on connect() interrupt) are racy. Even if we make
>server-side drop the half-open-broken child from accept_queue, user might
>race us with accept().
>
>L = socket()
>listen(L)
>				S = socket()
>				connect(S, L)
>					S.sk_state = TCP_SYN_SENT
>					send VIRTIO_VSOCK_OP_REQUEST
>
>				connect() is interrupted
>					S.sk_state = TCP_CLOSE
>					send VIRTIO_VSOCK_OP_SHUTDOWN|CHILD
>
>L receives REQUEST
>	C = socket()
>	C.sk_state = TCP_ESTABLISHED
>	add C to L.accept_queue
>	add C to connected_table
>	send VIRTIO_VSOCK_OP_RESPONSE
>
>				S receives RESPONSE
>					unexpected state:
>					S.sk_state != TCP_SYN_SENT
>					send VIRTIO_VSOCK_OP_RST
>
>C receives SHUTDOWN
>	if !flags.CHILD
>		send VIRTIO_VSOCK_OP_RST
>	virtio_transport_do_close()
>		C.sk_state = TCP_CLOSING
>	del C from connected_table
>	// if flags.CHILD
>	//	(schedule?) del C from L.accept_queue?
>	//	racy anyway
>
>L receives RST
>	nothing (as RST reply to client's RST won't be send)
>
>So that's all I could come up with. Please let me know what you think.

Thanks again for everything you're doing, much appreciated!

If you agree, I'd say let's leave things as they are for now.

Thanks,
Stefano