netdev - Re: [PATCH net-next v5 14/25] ovpn: implement TCP transport

Message-ID: <ZpTy860ss-JwT_2W@hog>
Date: Mon, 15 Jul 2024 11:59:15 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, ryazanov.s.a@...il.com,
	pabeni@...hat.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v5 14/25] ovpn: implement TCP transport

2024-06-27, 15:08:32 +0200, Antonio Quartulli wrote:
> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
> index 0475440642dd..764b3df996bc 100644
> --- a/drivers/net/ovpn/io.c
> +++ b/drivers/net/ovpn/io.c
> @@ -21,6 +21,7 @@
>  #include "netlink.h"
>  #include "proto.h"
>  #include "socket.h"
> +#include "tcp.h"
>  #include "udp.h"
>  #include "skb.h"
>  
> @@ -84,8 +85,11 @@ void ovpn_decrypt_post(struct sk_buff *skb, int ret)
>  	/* PID sits after the op */
>  	pid = (__force __be32 *)(skb->data + OVPN_OP_SIZE_V2);
>  	ret = ovpn_pktid_recv(&ks->pid_recv, ntohl(*pid), 0);
> -	if (unlikely(ret < 0))
> +	if (unlikely(ret < 0)) {
> +		net_err_ratelimited("%s: PKT ID RX error: %d\n",
> +				    peer->ovpn->dev->name, ret);

nit: this should be part of the "packet processing" patch?


> diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
> index dd4d91dfabb5..86d4696b1529 100644
> --- a/drivers/net/ovpn/peer.h
> +++ b/drivers/net/ovpn/peer.h
> @@ -10,8 +10,8 @@
>  #ifndef _NET_OVPN_OVPNPEER_H_
>  #define _NET_OVPN_OVPNPEER_H_
>  
> -#include <linux/ptr_ring.h>

nit: I think you don't need it at all in this version and forgot to
drop it in a previous patch? (I didn't notice when it was introduced)



> +static int ovpn_tcp_to_userspace(struct ovpn_socket *sock, struct sk_buff *skb)
> +{
> +	struct sock *sk = sock->sock->sk;
> +
> +	skb_set_owner_r(skb, sk);
> +	memset(skb->cb, 0, sizeof(skb->cb));

nit: skb->cb was already zeroed in ovpn_tcp_rcv right before this
call, so this memset is redundant.

> +	skb_queue_tail(&sock->peer->tcp.user_queue, skb);
> +	sock->peer->tcp.sk_cb.sk_data_ready(sk);
> +
> +	return 0;
> +}
> +
> +static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
> +{
[...]
> +	/* DATA_V2 packets are handled in kernel, the rest goes to user space */
> +	if (likely(ovpn_opcode_from_skb(skb, 0) == OVPN_DATA_V2)) {
> +		/* hold reference to peer as required by ovpn_recv().
> +		 *
> +		 * NOTE: in this context we should already be holding a
> +		 * reference to this peer, therefore ovpn_peer_hold() is
> +		 * not expected to fail
> +		 */
> +		WARN_ON(!ovpn_peer_hold(peer));

Drop the packet if this fails? Otherwise ovpn_recv() runs without a
reference on the peer, and I suspect we'll crash later on.

> +		ovpn_recv(peer, skb);
> +	} else {
> +		/* The packet size header must be there when sending the packet
> +		 * to userspace, therefore we put it back
> +		 */
> +		skb_push(skb, 2);
> +		memset(skb->cb, 0, sizeof(skb->cb));
> +		if (ovpn_tcp_to_userspace(peer->sock, skb) < 0) {
> +			net_warn_ratelimited("%s: cannot send skb to userspace\n",
> +					     peer->ovpn->dev->name);
> +			goto err;
> +		}
> +	}
[...]


> +void ovpn_tcp_socket_detach(struct socket *sock)
> +{
> +	struct ovpn_socket *ovpn_sock;
> +	struct ovpn_peer *peer;
> +
> +	if (!sock)
> +		return;
> +
> +	rcu_read_lock();
> +	ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
> +
[...]
> +	/* cancel any ongoing work. Done after removing the CBs so that these
> +	 * workers cannot be re-armed
> +	 */
> +	cancel_work_sync(&peer->tcp.tx_work);

I don't think it's OK to call cancel_work_sync() under
rcu_read_lock(), since it can sleep.

> +	strp_done(&peer->tcp.strp);

And same here, since strp_done also calls cancel_work_sync.

> +	rcu_read_unlock();
> +}
> +
> +static void ovpn_tcp_send_sock(struct ovpn_peer *peer)
> +{
> +	struct sk_buff *skb = peer->tcp.out_msg.skb;
> +
> +	if (!skb)
> +		return;
> +
> +	if (peer->tcp.tx_in_progress)
> +		return;
> +
> +	peer->tcp.tx_in_progress = true;

I'm not convinced this is safe. ovpn_tcp_send_sock could run
concurrently for the same peer (lock_sock doesn't exclude bh_lock_sock
after the short "grab ownership" phase), so I think both sides could
see tx_in_progress = false and then proceed.
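
To illustrate the kind of check I mean, here is a minimal userspace
sketch using C11 atomics (not kernel code). struct tx_state,
tx_try_claim() and tx_release() are made-up names for this example and
are not part of the patch; in the driver a lock or an atomic bit
operation would probably be the natural tool. The point is only that
the test and the set must be one atomic step:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-peer TX state; not the patch's struct. */
struct tx_state {
	atomic_bool tx_in_progress;
};

/* Returns true only for the single caller that flips false -> true.
 * A plain "if (flag) return; flag = true;" lets two callers both pass
 * the check before either of them stores.
 */
static bool tx_try_claim(struct tx_state *st)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&st->tx_in_progress,
					      &expected, true);
}

/* Called by the owner once it is done sending. */
static void tx_release(struct tx_state *st)
{
	atomic_store(&st->tx_in_progress, false);
}
```

A caller that gets false from tx_try_claim() simply returns, as the
current code does when it sees tx_in_progress set.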


> +	do {
> +		int ret = skb_send_sock_locked(peer->sock->sock->sk, skb,
> +					       peer->tcp.out_msg.offset,
> +					       peer->tcp.out_msg.len);
> +		if (unlikely(ret < 0)) {
> +			if (ret == -EAGAIN)
> +				goto out;

Doesn't this silently drop the message? In the case of a userspace
message, ovpn_tcp_sendmsg would then lie to the user (the openvpn
client), claiming that the control message was sent (ret = size just
above the unlock).

> +
> +			net_warn_ratelimited("%s: TCP error to peer %u: %d\n",
> +					     peer->ovpn->dev->name, peer->id,
> +					     ret);
> +
> +			/* in case of TCP error we can't recover the VPN
> +			 * stream therefore we abort the connection
> +			 */
> +			ovpn_peer_del(peer,
> +				      OVPN_DEL_PEER_REASON_TRANSPORT_ERROR);
> +			break;
> +		}
> +
> +		peer->tcp.out_msg.len -= ret;
> +		peer->tcp.out_msg.offset += ret;
> +	} while (peer->tcp.out_msg.len > 0);

Another thing that worries me: assume the receiver is a bit slow, the
underlying TCP socket gets stuck. skb_send_sock_locked manages to push
some data down the TCP socket, but not everything. We advance by that
amount, and restart this loop. The socket is still stuck, so
skb_send_sock_locked returns -EAGAIN. We have only pushed a partial
message down to the TCP socket, but we drop the rest? Now the stream
is broken, and the next call to ovpn_tcp_send_sock will happily send
its message.

ovpn_tcp_send_sock with msg_len = 1000
iteration 1
  skb_send_sock_locked returns 100
  advance
iteration 2
  skb_send_sock_locked returns -EAGAIN
  goto out


So you'd have to keep that partially-sent message around until you can
finish pushing it out on the socket.
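
A rough userspace model of that idea, under my own assumptions: struct
pending_msg, pending_msg_flush(), struct mock_sock and mock_send() are
all hypothetical names, and the mock socket only imitates the "accepts
some bytes, then returns -EAGAIN" behaviour described above. It is not
how skb_send_sock_locked works internally. On -EAGAIN the offset and
remaining length are left in place so a later call resumes where the
previous one stopped:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical pending-message state, loosely modelled on out_msg. */
struct pending_msg {
	size_t offset;	/* bytes already handed to the socket */
	size_t len;	/* bytes still to send */
};

/* Transmit callback: returns bytes accepted (> 0) or a negative errno. */
typedef int (*xmit_fn)(void *ctx, size_t offset, size_t len);

/* Push as much as possible. Returns 0 once the message is complete,
 * -EAGAIN if the socket is full (state is kept for a later retry), or
 * another negative errno on a fatal error.
 */
static int pending_msg_flush(struct pending_msg *m, xmit_fn xmit, void *ctx)
{
	while (m->len > 0) {
		int ret = xmit(ctx, m->offset, m->len);

		if (ret < 0)
			return ret;	/* offset/len untouched, nothing lost */
		if (ret == 0)
			return -EAGAIN;	/* no progress, try again later */

		m->offset += (size_t)ret;
		m->len -= (size_t)ret;
	}
	return 0;
}

/* Fake socket that accepts at most 'room' more bytes. */
struct mock_sock {
	size_t room;	/* bytes the fake socket will still accept */
	size_t sent;	/* total bytes accepted so far */
};

static int mock_send(void *ctx, size_t offset, size_t len)
{
	struct mock_sock *s = ctx;
	size_t n = len < s->room ? len : s->room;

	(void)offset;	/* a real sender would read the data at this offset */
	if (n == 0)
		return -EAGAIN;
	s->room -= n;
	s->sent += n;
	return (int)n;
}
```

With msg_len = 1000 and room for 100 bytes, the first flush stops with
-EAGAIN at offset 100 and len 900. Once the socket has room again, a
second flush sends the remaining 900 bytes instead of starting a new
message in the middle of the stream.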


[...]
> +static int ovpn_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> +{
> +	struct ovpn_socket *sock;
> +	int ret, linear = PAGE_SIZE;
> +	struct ovpn_peer *peer;
> +	struct sk_buff *skb;
> +
> +	rcu_read_lock();
> +	sock = rcu_dereference_sk_user_data(sk);
> +	peer = sock->peer;
> +	rcu_read_unlock();

What's stopping the peer being freed here?
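
The usual answer is to take a reference while still inside the
read-side section and to bail out if that fails, then drop the
reference when done. The following is only a userspace model of the
"hold unless already zero" part, written with C11 atomics. struct obj,
obj_hold() and obj_put() are hypothetical names for this sketch; in the
patch the corresponding call would presumably be ovpn_peer_hold(), and
this model has no RCU in it:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for the peer. */
struct obj {
	atomic_int refcount;
};

/* Take a reference only if the object is still alive (count > 0).
 * Once the count has reached zero the object is on its way to being
 * freed and must not be revived.
 */
static bool obj_hold(struct obj *o)
{
	int cur = atomic_load(&o->refcount);

	while (cur > 0) {
		if (atomic_compare_exchange_weak(&o->refcount, &cur, cur + 1))
			return true;
		/* the failed exchange reloaded cur; retry with the new value */
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	return atomic_fetch_sub(&o->refcount, 1) == 1;
}
```

In ovpn_tcp_sendmsg terms, the hold would happen between
rcu_dereference_sk_user_data() and rcu_read_unlock(), with an error
return if it fails, and the put on every exit path after that.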

-- 
Sabrina