[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250509142630.6947-11-antonio@openvpn.net>
Date: Fri, 9 May 2025 16:26:20 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: netdev@...r.kernel.org
Cc: Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Sabrina Dubroca <sd@...asysnail.net>,
Antonio Quartulli <antonio@...nvpn.net>,
Al Viro <viro@...iv.linux.org.uk>,
Qingfang Deng <dqfext@...il.com>,
Gert Doering <gert@...enie.muc.de>
Subject: [PATCH net-next 10/10] ovpn: ensure sk is still valid during cleanup
In case of UDP peer timeout, an openvpn client (userspace)
performs the following actions:
1. receives the peer deletion notification (reason=timeout)
2. closes the socket
Upon 1. we have the following:
- ovpn_peer_keepalive_work()
- ovpn_socket_release()
- synchronize_rcu()
At this point, 2. gets a chance to complete and ovpn_sock->sock->sk
becomes NULL. ovpn_socket_release() will then attempt dereferencing it,
resulting in the following crash log:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000077: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000003b8-0x00000000000003bf]
CPU: 12 UID: 0 PID: 162 Comm: kworker/12:1 Tainted: G O 6.15.0-rc2-00635-g521139ac3840 #272 PREEMPT(full)
Tainted: [O]=OOT_MODULE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
Workqueue: events ovpn_peer_keepalive_work [ovpn]
RIP: 0010:ovpn_socket_release+0x23c/0x500 [ovpn]
Code: ea 03 80 3c 02 00 0f 85 71 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 64 24 18 49 8d bc 24 be 03 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 30
RSP: 0018:ffffc90000c9fb18 EFLAGS: 00010217
RAX: dffffc0000000000 RBX: ffff8881148d7940 RCX: ffffffff817787bb
RDX: 0000000000000077 RSI: 0000000000000008 RDI: 00000000000003be
RBP: ffffc90000c9fb30 R08: 0000000000000000 R09: fffffbfff0d3e840
R10: ffffffff869f4207 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888115eb9300 R14: ffffc90000c9fbc8 R15: 000000000000000c
FS: 0000000000000000(0000) GS:ffff8882b0151000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f37266b6114 CR3: 00000000054a8000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
unlock_ovpn+0x8b/0xe0 [ovpn]
ovpn_peer_keepalive_work+0xe3/0x540 [ovpn]
? ovpn_peers_free+0x780/0x780 [ovpn]
? lock_acquire+0x56/0x70
? process_one_work+0x888/0x1740
process_one_work+0x933/0x1740
? pwq_dec_nr_in_flight+0x10b0/0x10b0
? move_linked_works+0x12d/0x2c0
? assign_work+0x163/0x270
worker_thread+0x4d6/0xd90
? preempt_count_sub+0x4c/0x70
? process_one_work+0x1740/0x1740
kthread+0x36c/0x710
? trace_preempt_on+0x8c/0x1e0
? kthread_is_per_cpu+0xc0/0xc0
? preempt_count_sub+0x4c/0x70
? _raw_spin_unlock_irq+0x36/0x60
? calculate_sigpending+0x7b/0xa0
? kthread_is_per_cpu+0xc0/0xc0
ret_from_fork+0x3a/0x80
? kthread_is_per_cpu+0xc0/0xc0
ret_from_fork_asm+0x11/0x20
</TASK>
Modules linked in: ovpn(O)
Reason for accessing sk is ithat we need to retrieve its
protocol and continue the cleanup routine accordingly.
Fix the crash by grabbing a reference to sk before proceeding
with the cleanup. If the refcounter has reached zero, we
know that the socket is being destroyed and thus we skip
the cleanup in ovpn_socket_release().
Fixes: 11851cbd60ea ("ovpn: implement TCP transport")
Cc: Al Viro <viro@...iv.linux.org.uk>
Cc: Eric Dumazet <edumazet@...gle.com>
Reported-by: Qingfang Deng <dqfext@...il.com>
Tested-By: Gert Doering <gert@...enie.muc.de>
Signed-off-by: Antonio Quartulli <antonio@...nvpn.net>
---
drivers/net/ovpn/socket.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ovpn/socket.c b/drivers/net/ovpn/socket.c
index a83cbab72591..66a2ecbc483b 100644
--- a/drivers/net/ovpn/socket.c
+++ b/drivers/net/ovpn/socket.c
@@ -66,6 +66,7 @@ static bool ovpn_socket_put(struct ovpn_peer *peer, struct ovpn_socket *sock)
void ovpn_socket_release(struct ovpn_peer *peer)
{
struct ovpn_socket *sock;
+ struct sock *sk;
bool released;
might_sleep();
@@ -75,13 +76,14 @@ void ovpn_socket_release(struct ovpn_peer *peer)
if (!sock)
return;
- /* sanity check: we should not end up here if the socket
- * was already closed
+ /* sock->sk may be released concurrently, therefore we
+ * first attempt grabbing a reference.
+ * if sock->sk is NULL it means it is already being
+ * destroyed and we don't need any further cleanup
*/
- if (!sock->sock->sk) {
- DEBUG_NET_WARN_ON_ONCE(1);
+ sk = sock->sock->sk;
+ if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
return;
- }
/* Drop the reference while holding the sock lock to avoid
* concurrent ovpn_socket_new call to mess up with a partially
@@ -90,18 +92,18 @@ void ovpn_socket_release(struct ovpn_peer *peer)
* Holding the lock ensures that a socket with refcnt 0 is fully
* detached before it can be picked by a concurrent reader.
*/
- lock_sock(sock->sock->sk);
+ lock_sock(sk);
released = ovpn_socket_put(peer, sock);
- release_sock(sock->sock->sk);
+ release_sock(sk);
/* align all readers with sk_user_data being NULL */
synchronize_rcu();
/* following cleanup should happen with lock released */
if (released) {
- if (sock->sock->sk->sk_protocol == IPPROTO_UDP) {
+ if (sk->sk_protocol == IPPROTO_UDP) {
netdev_put(sock->ovpn->dev, &sock->dev_tracker);
- } else if (sock->sock->sk->sk_protocol == IPPROTO_TCP) {
+ } else if (sk->sk_protocol == IPPROTO_TCP) {
/* wait for TCP jobs to terminate */
ovpn_tcp_socket_wait_finish(sock);
ovpn_peer_put(sock->peer);
@@ -111,6 +113,7 @@ void ovpn_socket_release(struct ovpn_peer *peer)
*/
kfree(sock);
}
+ sock_put(sk);
}
static bool ovpn_socket_hold(struct ovpn_socket *sock)
--
2.49.0
Powered by blists - more mailing lists