lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <70371863-fa71-48e0-a1e5-fee83e7ca37c@rbox.co>
Date: Fri, 25 Jul 2025 11:06:52 +0200
From: Michal Luczaj <mhal@...x.co>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
 netdev@...r.kernel.org
Subject: Re: vsock broken after connect() returns EINTR (was Re: [PATCH net
 2/2] vsock/test: Add test for SO_LINGER null ptr deref)

On 4/15/25 15:07, Stefano Garzarella wrote:
> On Fri, Apr 11, 2025 at 04:43:35PM +0200, Michal Luczaj wrote:
...
>> Once connect() fails with EINTR (e.g. due to a signal delivery), retrying
>> connect() (to the same listener) fails. That is what the code below was
>> trying to show.
> 
> mmm, something is going wrong in the vsock_connect().
> 
> IIUC if we fails with EINTR, we are kind of resetting the socket.
> Should we do the same we do in vsock_assign_transport() when we found 
> that we are changing transport?
> 
> I mean calling release(), vsock_deassign_transport(). etc.
> I'm worried about having pending packets in flight.
> 
> BTW we need to investigate more, I agree.

I took a look. Once I've added a condition to break the connect() loop
(right after schedule_timeout(), on `sk_state == TCP_ESTABLISHED`), things
got a bit clearer for me.

Because listener keeps child socket in the accept_queue _and_
connected_table, every time we try to re-connect() our
VIRTIO_VSOCK_OP_REQUEST is answered with VIRTIO_VSOCK_OP_RST, which kills
the re-connect(). IOW:

L = socket()
listen(L)
				S = socket()
				connect(S, L)
					S.sk_state = TCP_SYN_SENT
					send VIRTIO_VSOCK_OP_REQUEST

				connect() is interrupted
					S.sk_state = TCP_CLOSE

L receives REQUEST
	C = socket()
	C.sk_state = TCP_ESTABLISHED
	add C to L.accept_queue
	add C to connected_table
	send VIRTIO_VSOCK_OP_RESPONSE

				S receives RESPONSE
					unexpected state:
					S.sk_state != TCP_SYN_SENT
					send VIRTIO_VSOCK_OP_RST

C receives RST
	virtio_transport_do_close()
		C.sk_state = TCP_CLOSING

				retry connect(S, L)
					S.sk_state = TCP_SYN_SENT
					send VIRTIO_VSOCK_OP_REQUEST

C (not L!) receives REQUEST
	send VIRTIO_VSOCK_OP_RST

				S receives RST, connect() fails
					S.sk_state = TCP_CLOSE
					S.sk_err = ECONNRESET

I was mistaken that flushing the accept queue, i.e. close(accept()), would
be enough to drop C from connected_table and let the re-connect() succeed.
In fact, for the removal to actually take place you need to wait a bit
after the flushing close(). What's more, child's vsock_release() might
throw a poorly timed OP_RST at a client -- affecting the re-connect().

Now, one thing we can do about this is: nothing. Let the user space handle
the client- and server-side consequences of client-side's signal delivery
(or timeout).

Another thing we can do is switch to a 3-way handshake. I think that would
also eliminate the need for vsock_transport_cancel_pkt(), which was
introduced in commit 380feae0def7 ("vsock: cancel packets when failing to
connect").

All the other options I was considering (based on the idea to send client
-> server OP_SHUTDOWN on connect() interrupt) are racy. Even if we make
server-side drop the half-open-broken child from accept_queue, user might
race us with accept().

L = socket()
listen(L)
				S = socket()
				connect(S, L)
					S.sk_state = TCP_SYN_SENT
					send VIRTIO_VSOCK_OP_REQUEST

				connect() is interrupted
					S.sk_state = TCP_CLOSE
					send VIRTIO_VSOCK_OP_SHUTDOWN|CHILD

L receives REQUEST
	C = socket()
	C.sk_state = TCP_ESTABLISHED
	add C to L.accept_queue
	add C to connected_table
	send VIRTIO_VSOCK_OP_RESPONSE

				S receives RESPONSE
					unexpected state:
					S.sk_state != TCP_SYN_SENT
					send VIRTIO_VSOCK_OP_RST

C receives SHUTDOWN
	if !flags.CHILD
		send VIRTIO_VSOCK_OP_RST
	virtio_transport_do_close()
		C.sk_state = TCP_CLOSING
	del C from connected_table
	// if flags.CHILD
	//	(schedule?) del C from L.accept_queue?
	//	racy anyway

L receives RST
	nothing (as RST reply to client's RST won't be send)

So that's all I could come up with. Please let me know what you think.

Thanks,
Michal


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ