Message-ID: <1a47ce02-fd42-4761-8697-f3f315011cc6@redhat.com>
Date: Tue, 13 May 2025 10:21:34 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>, Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Sabrina Dubroca <sd@...asysnail.net>, Al Viro <viro@...iv.linux.org.uk>,
Qingfang Deng <dqfext@...il.com>, Gert Doering <gert@...enie.muc.de>
Subject: Re: [PATCH net-next 10/10] ovpn: ensure sk is still valid during cleanup
On 5/13/25 3:37 AM, Jakub Kicinski wrote:
> On Fri, 9 May 2025 16:26:20 +0200 Antonio Quartulli wrote:
>> In case of UDP peer timeout, an openvpn client (userspace)
>> performs the following actions:
>> 1. receives the peer deletion notification (reason=timeout)
>> 2. closes the socket
>>
>> Upon 1. we have the following:
>> - ovpn_peer_keepalive_work()
>> - ovpn_socket_release()
>> - synchronize_rcu()
>> At this point, 2. gets a chance to complete and ovpn_sock->sock->sk
>> becomes NULL. ovpn_socket_release() will then attempt dereferencing it,
>> resulting in the following crash log:
>
> What runs where is a bit unclear to me. Specifically I'm not sure what
> runs the code under the "if (released)" branch of ovpn_socket_release()
> if the user closes the socket. Because you now return without a WARN().
>
>> @@ -75,13 +76,14 @@ void ovpn_socket_release(struct ovpn_peer *peer)
>> if (!sock)
>> return;
>>
>> - /* sanity check: we should not end up here if the socket
>> - * was already closed
>> + /* sock->sk may be released concurrently, therefore we
>> + * first attempt grabbing a reference.
>> + * if sock->sk is NULL it means it is already being
>> + * destroyed and we don't need any further cleanup
>> */
>> - if (!sock->sock->sk) {
>> - DEBUG_NET_WARN_ON_ONCE(1);
>> + sk = sock->sock->sk;
>> + if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
>
> How is sk protected from getting reused here?
> refcount_inc_not_zero() still needs the underlying object to be allocated.
> I don't see any locking here, and code says this function may sleep so
> it can't be called under RCU, either.

I agree this still looks racy. By the time the socket close runs, nobody
else should hold a reference to, or otherwise have access to, the
'struct socket'. I'm under the impression that ovpn_socket should acquire
references to the underlying fd instead of keeping its own refcount.
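
To make the concern about refcount_inc_not_zero() concrete, here is a
minimal userspace sketch of the "increment unless zero" semantics. It
assumes C11 atomics as a stand-in for the kernel's refcount_t; the names
model_obj, model_inc_not_zero() and model_put() are hypothetical and are
not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcounted object such as a sock. */
struct model_obj {
	atomic_int refcnt;
};

/* Take a reference only if the count is still non-zero.
 * This reads obj->refcnt, so obj itself must still be allocated.
 */
static bool model_inc_not_zero(struct model_obj *obj)
{
	int old = atomic_load(&obj->refcnt);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the last reference is gone. */
static bool model_put(struct model_obj *obj)
{
	int old = atomic_fetch_sub(&obj->refcnt, 1);

	assert(old > 0);
	return old == 1;
}
```

The helper refuses to revive an object whose count has already dropped to
zero, but it still has to read the counter. That is only safe if something
else (a lock, RCU, or a reference held on the fd) guarantees the memory has
not been freed or reused in the meantime, which is the guarantee that
appears to be missing in the patch.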

Side note: the ovpn_socket refcount release/detach path looks wrong, at
least in the case of a UDP socket, as ovpn_udp_socket_detach() calls
setup_udp_tunnel_sock(), which in turn will try to _increment_ various
core counters instead of decrementing them (i.e. udp_encap_enable would
end up wrongly accounted after that call).

/P