[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24efb98c-d6ba-42c2-91b7-71b969179aff@rbox.co>
Date: Fri, 14 Mar 2025 16:29:03 +0100
From: Michal Luczaj <mhal@...x.co>
To: John Fastabend <john.fastabend@...il.com>
Cc: Stefano Garzarella <sgarzare@...hat.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Bobby Eshleman <bobby.eshleman@...edance.com>,
"Michael S. Tsirkin" <mst@...hat.com>, netdev@...r.kernel.org,
bpf@...r.kernel.org
Subject: Re: [PATCH net] vsock/bpf: Handle EINTR connect() racing against
sockmap update
On 3/11/25 17:23, John Fastabend wrote:
> On 2025-03-07 10:27:50, Michal Luczaj wrote:
>> Signal delivered during connect() may result in a disconnect of an already
>> TCP_ESTABLISHED socket. Problem is that such established socket might have
>> been placed in a sockmap before the connection was closed. We end up with a
>> SS_UNCONNECTED vsock in a sockmap. And this, combined with the ability to
>> reassign (unconnected) vsock's transport to NULL, breaks the sockmap
>> contract. As manifested by WARN_ON_ONCE.
>>
>> Ensure the socket does not stay in sockmap.
>>
>> WARNING: CPU: 10 PID: 1310 at net/vmw_vsock/vsock_bpf.c:90 vsock_bpf_recvmsg+0xb4b/0xdf0
>> CPU: 10 UID: 0 PID: 1310 Comm: a.out Tainted: G W 6.14.0-rc4+
>> sock_recvmsg+0x1b2/0x220
>> __sys_recvfrom+0x190/0x270
>> __x64_sys_recvfrom+0xdc/0x1b0
>> do_syscall_64+0x93/0x1b0
>> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> Fixes: 634f1a7110b4 ("vsock: support sockmap")
>> Signed-off-by: Michal Luczaj <mhal@...x.co>
>
> Hi Michal,
>
> Unhashing the socket to stop any references from sockmap side if the
> sock is being put into CLOSING state makes sense to me. Was there
> another v2 somewhere? I didn't see it in my inbox or I missed it.
> I think you mentioned more fixes were needed.
Great, thanks for checking. I was worried I might be missing some
subtleties of sock_map_unhash() not calling `sk_psock_stop(psock)` nor
`cancel_delayed_work_sync(&psock->work)`. Especially since user still has
socket descriptor open and can play with such "unhashed" socket.
I've just sent v2: https://lore.kernel.org/netdev/20250314-vsock-trans-signal-race-v2-0-421a41f60f42@rbox.co/
Repro is adapted to sockmap_basic. And to answer your question from
another thread: test triggers warning in a second. Currently timeout is 2s.
I'm not sure how useful it may be for other families, but let me know if
you'd rather have it somehow more generic.
Thanks,
Michal
Powered by blists - more mailing lists