Message-ID: <8c6d237f-dde7-4922-b92d-6a638fc7376e@openvpn.net>
Date: Thu, 18 Jul 2024 12:13:38 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, ryazanov.s.a@...il.com,
pabeni@...hat.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v5 14/25] ovpn: implement TCP transport
Hi,
On 15/07/2024 11:59, Sabrina Dubroca wrote:
> 2024-06-27, 15:08:32 +0200, Antonio Quartulli wrote:
>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>> index 0475440642dd..764b3df996bc 100644
>> --- a/drivers/net/ovpn/io.c
>> +++ b/drivers/net/ovpn/io.c
>> @@ -21,6 +21,7 @@
>> #include "netlink.h"
>> #include "proto.h"
>> #include "socket.h"
>> +#include "tcp.h"
>> #include "udp.h"
>> #include "skb.h"
>>
>> @@ -84,8 +85,11 @@ void ovpn_decrypt_post(struct sk_buff *skb, int ret)
>> /* PID sits after the op */
>> pid = (__force __be32 *)(skb->data + OVPN_OP_SIZE_V2);
>> ret = ovpn_pktid_recv(&ks->pid_recv, ntohl(*pid), 0);
>> - if (unlikely(ret < 0))
>> + if (unlikely(ret < 0)) {
>> + net_err_ratelimited("%s: PKT ID RX error: %d\n",
>> + peer->ovpn->dev->name, ret);
>
> nit: this should be part of the "packet processing" patch?
Yep, makes sense.
>
>
>> diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
>> index dd4d91dfabb5..86d4696b1529 100644
>> --- a/drivers/net/ovpn/peer.h
>> +++ b/drivers/net/ovpn/peer.h
>> @@ -10,8 +10,8 @@
>> #ifndef _NET_OVPN_OVPNPEER_H_
>> #define _NET_OVPN_OVPNPEER_H_
>>
>> -#include <linux/ptr_ring.h>
>
> nit: I think you don't need it at all in this version and forgot to
> drop it in a previous patch? (I didn't notice when it was introduced)
Ouch, you are correct.
>
>
>
>> +static int ovpn_tcp_to_userspace(struct ovpn_socket *sock, struct sk_buff *skb)
>> +{
>> + struct sock *sk = sock->sock->sk;
>> +
>> + skb_set_owner_r(skb, sk);
>> + memset(skb->cb, 0, sizeof(skb->cb));
>
> nit: this was just done in ovpn_tcp_rcv
right!
>
>> + skb_queue_tail(&sock->peer->tcp.user_queue, skb);
>> + sock->peer->tcp.sk_cb.sk_data_ready(sk);
>> +
>> + return 0;
>> +}
>> +
>> +static void ovpn_tcp_rcv(struct strparser *strp, struct sk_buff *skb)
>> +{
> [...]
>> + /* DATA_V2 packets are handled in kernel, the rest goes to user space */
>> + if (likely(ovpn_opcode_from_skb(skb, 0) == OVPN_DATA_V2)) {
>> + /* hold reference to peer as required by ovpn_recv().
>> + *
>> + * NOTE: in this context we should already be holding a
>> + * reference to this peer, therefore ovpn_peer_hold() is
>> + * not expected to fail
>> + */
>> + WARN_ON(!ovpn_peer_hold(peer));
>
> drop the packet if this fails? otherwise I suspect we'll crash later on.
Yeah, jumping to "err" and dropping everything makes sense.
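Something along these lines (rough sketch; assuming the existing "err"
label in ovpn_tcp_rcv() already takes care of freeing the skb):

        /* DATA_V2 packets are handled in kernel, the rest goes to user space */
        if (likely(ovpn_opcode_from_skb(skb, 0) == OVPN_DATA_V2)) {
                /* hold reference to peer as required by ovpn_recv().
                 * Should this unexpectedly fail, drop the packet instead
                 * of handing ovpn_recv() a peer that is going away
                 */
                if (WARN_ON(!ovpn_peer_hold(peer)))
                        goto err;

                ovpn_recv(peer, skb);
        } else {
                ...
        }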
>
>> + ovpn_recv(peer, skb);
>> + } else {
>> + /* The packet size header must be there when sending the packet
>> + * to userspace, therefore we put it back
>> + */
>> + skb_push(skb, 2);
>> + memset(skb->cb, 0, sizeof(skb->cb));
>> + if (ovpn_tcp_to_userspace(peer->sock, skb) < 0) {
>> + net_warn_ratelimited("%s: cannot send skb to userspace\n",
>> + peer->ovpn->dev->name);
>> + goto err;
>> + }
>> + }
> [...]
>
>
>> +void ovpn_tcp_socket_detach(struct socket *sock)
>> +{
>> + struct ovpn_socket *ovpn_sock;
>> + struct ovpn_peer *peer;
>> +
>> + if (!sock)
>> + return;
>> +
>> + rcu_read_lock();
>> + ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
>> +
> [...]
>> + /* cancel any ongoing work. Done after removing the CBs so that these
>> + * workers cannot be re-armed
>> + */
>> + cancel_work_sync(&peer->tcp.tx_work);
>
> I don't think that's ok to call under rcu_read_lock, it seems it can
> sleep.
>
>> + strp_done(&peer->tcp.strp);
>
> And same here, since strp_done also calls cancel_work_sync.
Hm, you're right. I'll see how to rearrange this part... I expect it to
be tricky.
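Something I want to try (just a sketch, assuming I can grab a peer
reference while under RCU and restore the original socket callbacks
before dropping the RCU lock, so that the workers cannot be re-armed):

void ovpn_tcp_socket_detach(struct socket *sock)
{
        struct ovpn_socket *ovpn_sock;
        struct ovpn_peer *peer = NULL;

        if (!sock)
                return;

        rcu_read_lock();
        ovpn_sock = rcu_dereference_sk_user_data(sock->sk);
        if (ovpn_sock && ovpn_peer_hold(ovpn_sock->peer))
                peer = ovpn_sock->peer;

        /* restore the original socket callbacks here, still under RCU */

        rcu_read_unlock();

        if (!peer)
                return;

        /* now we are allowed to sleep */
        cancel_work_sync(&peer->tcp.tx_work);
        strp_done(&peer->tcp.strp);

        ovpn_peer_put(peer);
}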
>
>> + rcu_read_unlock();
>> +}
>> +
>> +static void ovpn_tcp_send_sock(struct ovpn_peer *peer)
>> +{
>> + struct sk_buff *skb = peer->tcp.out_msg.skb;
>> +
>> + if (!skb)
>> + return;
>> +
>> + if (peer->tcp.tx_in_progress)
>> + return;
>> +
>> + peer->tcp.tx_in_progress = true;
>
> I'm not convinced this is safe. ovpn_tcp_send_sock could run
> concurrently for the same peer (lock_sock doesn't exclude bh_lock_sock
> after the short "grab ownership" phase), so I think both sides could
> see tx_in_progress = false and then proceed.
I may be missing something here.
I was under the impression that ovpn_tcp_send_sock() is always invoked
with lock_sock() held. Shouldn't that be enough to prevent concurrent
executions for the same peer/sock?
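To make the assumption explicit, this is the pattern I expect at every
call site (simplified):

        lock_sock(peer->sock->sock->sk);
        ovpn_tcp_send_sock(peer);
        release_sock(peer->sock->sock->sk);

i.e. every caller of ovpn_tcp_send_sock() owns the socket lock and is
therefore serialized on it.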
>
>
>> + do {
>> + int ret = skb_send_sock_locked(peer->sock->sock->sk, skb,
>> + peer->tcp.out_msg.offset,
>> + peer->tcp.out_msg.len);
>> + if (unlikely(ret < 0)) {
>> + if (ret == -EAGAIN)
>> + goto out;
>
> This will silently drop the message? And then in case of a userspace
> message, ovpn_tcp_sendmsg will lie to the user (the openvpn client),
> claiming that the control message was sent (ret = size just above the
> unlock)?
Why do you think the message will be dropped?
By jumping to 'out' we are keeping the skb in peer->tcp.out_msg.skb,
with peer->tcp.out_msg.offset and peer->tcp.out_msg.len left untouched
and ready for the next attempt triggered by ovpn_tcp_write_space().
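The intended flow is roughly the following (sketch; ovpn_tcp_tx_work()
is just my shorthand for the tcp.tx_work handler, bodies simplified):

static void ovpn_tcp_write_space(struct sock *sk)
{
        struct ovpn_socket *sock;

        rcu_read_lock();
        sock = rcu_dereference_sk_user_data(sk);
        if (sock)
                /* room available again: re-arm the TX worker */
                schedule_work(&sock->peer->tcp.tx_work);
        rcu_read_unlock();
}

static void ovpn_tcp_tx_work(struct work_struct *work)
{
        struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
                                              tcp.tx_work);

        lock_sock(peer->sock->sock->sk);
        /* resumes from the out_msg.offset/out_msg.len left by the -EAGAIN */
        ovpn_tcp_send_sock(peer);
        release_sock(peer->sock->sock->sk);
}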
>
>> +
>> + net_warn_ratelimited("%s: TCP error to peer %u: %d\n",
>> + peer->ovpn->dev->name, peer->id,
>> + ret);
>> +
>> + /* in case of TCP error we can't recover the VPN
>> + * stream therefore we abort the connection
>> + */
>> + ovpn_peer_del(peer,
>> + OVPN_DEL_PEER_REASON_TRANSPORT_ERROR);
>> + break;
>> + }
>> +
>> + peer->tcp.out_msg.len -= ret;
>> + peer->tcp.out_msg.offset += ret;
>> + } while (peer->tcp.out_msg.len > 0);
>
> Another thing that worries me: assume the receiver is a bit slow, the
> underlying TCP socket gets stuck. skb_send_sock_locked manages to push
> some data down the TCP socket, but not everything. We advance by that
> amount, and restart this loop. The socket is still stuck, so
> skb_send_sock_locked returns -EAGAIN. We have only pushed a partial
> message down to the TCP socket, but we drop the rest? Now the stream
> is broken, and the next call to ovpn_tcp_send_sock will happily send
> its message.
I think this is answered above, where I say that we are actually keeping
the skb (not dropping it) ready for the next sending attempt.
>
> ovpn_tcp_send_sock with msg_len = 1000
> iteration 1
> skb_send_sock_locked returns 100
> advance
> iteration 2
> skb_send_sock_locked returns -EAGAIN
> goto out
>
>
> So you'd have to keep that partially-sent message around until you can
> finish pushing it out on the socket.
Yep, see above.
>
>
> [...]
>> +static int ovpn_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>> +{
>> + struct ovpn_socket *sock;
>> + int ret, linear = PAGE_SIZE;
>> + struct ovpn_peer *peer;
>> + struct sk_buff *skb;
>> +
>> + rcu_read_lock();
>> + sock = rcu_dereference_sk_user_data(sk);
>> + peer = sock->peer;
>> + rcu_read_unlock();
>
> What's stopping the peer being freed here?
I assumed that while we are in one of our own sk callbacks it should not
be possible for the peer refcount to reach 0.
But after double checking I don't think there is any protection
guaranteeing that. I will add a call to ovpn_peer_hold() and abort if it
fails.
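Roughly like this (sketch of the planned change, error handling
simplified):

static int ovpn_tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
        struct ovpn_socket *sock;
        int ret, linear = PAGE_SIZE;
        struct ovpn_peer *peer = NULL;
        struct sk_buff *skb;

        rcu_read_lock();
        sock = rcu_dereference_sk_user_data(sk);
        if (sock && sock->peer && ovpn_peer_hold(sock->peer))
                peer = sock->peer;
        rcu_read_unlock();

        if (unlikely(!peer))
                return -EIO;

        /* ... unchanged body, the peer cannot go away under us anymore ... */

        ovpn_peer_put(peer);
        return ret;
}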
--
Antonio Quartulli
OpenVPN Inc.