[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <fa56bc54-445f-4844-8401-934bb8e41021@openvpn.net>
Date: Tue, 17 Dec 2024 01:40:41 +0100
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Donald Hunter <donald.hunter@...il.com>, Shuah Khan <shuah@...nel.org>,
ryazanov.s.a@...il.com, Andrew Lunn <andrew+netdev@...n.ch>,
Simon Horman <horms@...nel.org>, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org, Xiao Liang <shaw.leon@...il.com>,
willemdebruijn.kernel@...il.com
Subject: Re: [PATCH net-next v15 06/22] ovpn: introduce the ovpn_socket object
On 16/12/2024 12:50, Antonio Quartulli wrote:
> On 16/12/2024 12:09, Sabrina Dubroca wrote:
> [...]
>>> Maybe we should call cancel_sync_work(&ovpn_sock->work) inside
>>> ovpn_socket_get()?
>>> So the latter will return NULL only when it is sure that the socket
>>> has been
>>> detached.
>>>
>>> At that point we can skip the following return and continue along the
>>> "new
>>> socket" path.
>>>
>>> What do you think?
>>
>> The work may not have been scheduled yet? (small window between the
>> last kref_put and schedule_work)
>>
>> Maybe a completion [Documentation/scheduler/completion.rst] would
>> solve it (but it makes things even more complex, unfortunately):
>>
>> - at the end of ovpn_socket_detach: complete(&ovpn_sock->detached);
>> - in ovpn_socket_new when handling EALREADY:
>> wait_for_completion(&ovpn_sock->detached);
>> - in ovpn_socket_new for the new socket: init_completion(&ovpn_sock-
>> >detached);
>>
>> but ovpn_sock could be gone immediately after complete(). Maybe
>> something with completion_done() before the kfree_rcu in
>> ovpn_socket_detach? I'm not that familiar with the completion API.
>>
>
> It seems the solution we are aiming for is more complex than the concept
> of ovpn_socket per se :-D
>
> I'll think a bit more about this..maybe we can avoid entering this
> situation at all..
I see that there are some kref_put variants that acquire a lock just
before hitting zero and running the release cb.
If I implement a kref_put variant that acquires the lock_sock, I could
then perform the udp detach under lock, thus ensuring that zero'ing the
refcount and erasing the sk_user_data happens while holding the lock_sock.
This way I should be able to prevent the situation where "sk_user_data
still says EALREADY, but the refcnt is actually 0".
I hope adding this new API is fine.
I am giving it a try now.
Regards,
--
Antonio Quartulli
OpenVPN Inc.
Powered by blists - more mailing lists