Message-ID: <eb9558b3-cd7e-4da6-a496-adca6132a601@openvpn.net>
Date: Fri, 10 May 2024 16:41:43 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
Sergey Ryazanov <ryazanov.s.a@...il.com>, Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew@...n.ch>,
Esben Haabendal <esben@...nix.com>
Subject: Re: [PATCH net-next v3 10/24] ovpn: implement basic RX path (UDP)
On 10/05/2024 15:45, Sabrina Dubroca wrote:
> 2024-05-06, 03:16:23 +0200, Antonio Quartulli wrote:
>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>> index 36cfb95edbf4..9935a863bffe 100644
>> --- a/drivers/net/ovpn/io.c
>> +++ b/drivers/net/ovpn/io.c
>> +/* Called after decrypt to write the IP packet to the device.
>> + * This method is expected to manage/free the skb.
>> + */
>> +static void ovpn_netdev_write(struct ovpn_peer *peer, struct sk_buff *skb)
>> +{
>> +        /* packet integrity was verified on the VPN layer - no need to perform
>> +         * any additional check along the stack
>
> But it could have been corrupted before it got into the VPN?
It could, but I believe a VPN should only take care of integrity along
its own tunnel (and that is guaranteed by the OpenVPN protocol).
If a corrupted packet enters the tunnel, we just deliver it as-is to the
other end; the upper layers (where the corruption actually happened)
have to deal with it.
>
>> + */
>> +        skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +        skb->csum_level = ~0;
>> +
>
> [...]
>> +int ovpn_napi_poll(struct napi_struct *napi, int budget)
>> +{
>> +        struct ovpn_peer *peer = container_of(napi, struct ovpn_peer, napi);
>> +        struct sk_buff *skb;
>> +        int work_done = 0;
>> +
>> +        if (unlikely(budget <= 0))
>> +                return 0;
>> +        /* this function should schedule at most 'budget' number of
>> +         * packets for delivery to the interface.
>> +         * If in the queue we have more packets than what allowed by the
>> +         * budget, the next polling will take care of those
>> +         */
>> +        while ((work_done < budget) &&
>> +               (skb = ptr_ring_consume_bh(&peer->netif_rx_ring))) {
>> +                ovpn_netdev_write(peer, skb);
>> +                work_done++;
>> +        }
>> +
>> +        if (work_done < budget)
>> +                napi_complete_done(napi, work_done);
>> +
>> +        return work_done;
>> +}
>
> Why not use gro_cells?
First because I did not know they existed :-)
> It would avoid all that napi polling and
> netif_rx_ring code (and it's per-cpu, going back to our other
> discussion around napi).
This sounds truly appealing. And if we can make this per-cpu by design,
I believe we can definitely drop the per-peer NAPI logic.
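Just to make the discussion concrete, this is roughly how I would picture
the gro_cells-based RX path (the struct is trimmed down and the helper
names are made up for illustration, so please read it as a sketch of the
API usage rather than actual patch code):

#include <net/gro_cells.h>

struct ovpn_struct {
        struct net_device *dev;
        struct gro_cells gro_cells;     /* one NAPI context per CPU */
        /* ... other fields omitted ... */
};

/* at interface init time */
static int ovpn_rx_init(struct ovpn_struct *ovpn)
{
        /* allocates and registers a per-CPU NAPI instance on ovpn->dev */
        return gro_cells_init(&ovpn->gro_cells, ovpn->dev);
}

/* called after decrypt, instead of queueing to netif_rx_ring + per-peer NAPI */
static void ovpn_net_rx(struct ovpn_struct *ovpn, struct sk_buff *skb)
{
        /* queues the skb on the current CPU's cell and schedules its NAPI */
        gro_cells_receive(&ovpn->gro_cells, skb);
}

/* at interface teardown */
static void ovpn_rx_uninit(struct ovpn_struct *ovpn)
{
        gro_cells_destroy(&ovpn->gro_cells);
}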
>
>
>> diff --git a/drivers/net/ovpn/proto.h b/drivers/net/ovpn/proto.h
>> new file mode 100644
>> index 000000000000..0a51104ed931
>> --- /dev/null
>> +++ b/drivers/net/ovpn/proto.h
> [...]
>> +/**
>> + * ovpn_key_id_from_skb - extract key ID from the skb head
>> + * @skb: the packet to extract the key ID code from
>> + *
>> + * Note: this function assumes that the skb head was pulled enough
>> + * to access the first byte.
>> + *
>> + * Return: the key ID
>> + */
>> +static inline u8 ovpn_key_id_from_skb(const struct sk_buff *skb)
>
>> +static inline u32 ovpn_opcode_compose(u8 opcode, u8 key_id, u32 peer_id)
>
> (tiny nit: those aren't used yet in this patch. probably not worth
> moving them into the right patch.)
Ouch. I am already going at a speed of 20-25 rph (Rebases Per Hour).
It shouldn't be a problem to clean this up too.
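(For context while reviewing proto.h: as I understand the wire format, a
DATA_V2 packet starts with a 32-bit word made of a 5-bit opcode, a 3-bit
key ID and a 24-bit peer ID, so the two helpers boil down to something
like the sketch below. The literal masks and shifts are only illustrative,
the real code uses named constants.)

#include <linux/types.h>
#include <linux/skbuff.h>

/* first payload byte: [ opcode: 5 bits | key id: 3 bits ], then 3 bytes of peer ID */
static inline u8 ovpn_key_id_from_skb(const struct sk_buff *skb)
{
        return *skb->data & 0x07;
}

static inline u32 ovpn_opcode_compose(u8 opcode, u8 key_id, u32 peer_id)
{
        return ((u32)opcode << 27) | ((u32)(key_id & 0x07) << 24) |
               (peer_id & 0x00ffffff);
}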
>
>
>> diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c
>> index f434da76dc0a..07182703e598 100644
>> --- a/drivers/net/ovpn/udp.c
>> +++ b/drivers/net/ovpn/udp.c
>> @@ -20,9 +20,117 @@
>> #include "bind.h"
>> #include "io.h"
>> #include "peer.h"
>> +#include "proto.h"
>> #include "socket.h"
>> #include "udp.h"
>>
>> +/**
>> + * ovpn_udp_encap_recv - Start processing a received UDP packet.
>> + * @sk: socket over which the packet was received
>> + * @skb: the received packet
>> + *
>> + * If the first byte of the payload is DATA_V2, the packet is further processed,
>> + * otherwise it is forwarded to the UDP stack for delivery to user space.
>> + *
>> + * Return:
>> + * 0 if skb was consumed or dropped
>> + * >0 if skb should be passed up to userspace as UDP (packet not consumed)
>> + * <0 if skb should be resubmitted as proto -N (packet not consumed)
>> + */
>> +static int ovpn_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
>> +{
>> +        struct ovpn_peer *peer = NULL;
>> +        struct ovpn_struct *ovpn;
>> +        u32 peer_id;
>> +        u8 opcode;
>> +        int ret;
>> +
>> +        ovpn = ovpn_from_udp_sock(sk);
>> +        if (unlikely(!ovpn)) {
>> +                net_err_ratelimited("%s: cannot obtain ovpn object from UDP socket\n",
>> +                                    __func__);
>> +                goto drop;
>> +        }
>> +
>> +        /* Make sure the first 4 bytes of the skb data buffer after the UDP
>> +         * header are accessible.
>> +         * They are required to fetch the OP code, the key ID and the peer ID.
>> +         */
>> +        if (unlikely(!pskb_may_pull(skb, sizeof(struct udphdr) + 4))) {
>
> Is this OVPN_OP_SIZE_V2?
It is! I will use that define, thanks.
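Something like this (with OVPN_OP_SIZE_V2 being the 4-byte
opcode/key-id/peer-id word mentioned in the comment above):

        if (unlikely(!pskb_may_pull(skb, sizeof(struct udphdr) + OVPN_OP_SIZE_V2))) {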
>
>> +                net_dbg_ratelimited("%s: packet too small\n", __func__);
>> +                goto drop;
>> +        }
>> +
>> +        opcode = ovpn_opcode_from_skb(skb, sizeof(struct udphdr));
>> +        if (unlikely(opcode != OVPN_DATA_V2)) {
>> +                /* DATA_V1 is not supported */
>> +                if (opcode == OVPN_DATA_V1)
>> +                        goto drop;
>> +
>> +                /* unknown or control packet: let it bubble up to userspace */
>> +                return 1;
>> +        }
>> +
>> +        peer_id = ovpn_peer_id_from_skb(skb, sizeof(struct udphdr));
>> +        /* some OpenVPN server implementations send data packets with the
>> +         * peer-id set to undef. In this case we skip the peer lookup by peer-id
>> +         * and we try with the transport address
>> +         */
>> +        if (peer_id != OVPN_PEER_ID_UNDEF) {
>> +                peer = ovpn_peer_get_by_id(ovpn, peer_id);
>> +                if (!peer) {
>> +                        net_err_ratelimited("%s: received data from unknown peer (id: %d)\n",
>> +                                            __func__, peer_id);
>> +                        goto drop;
>> +                }
>> +        }
>> +
>> +        if (!peer) {
>> +                /* data packet with undef peer-id */
>> +                peer = ovpn_peer_get_by_transp_addr(ovpn, skb);
>> +                if (unlikely(!peer)) {
>> +                        netdev_dbg(ovpn->dev,
>> +                                   "%s: received data with undef peer-id from unknown source\n",
>> +                                   __func__);
>
> _ratelimited?
Makes sense, I will use net_dbg_ratelimited.
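i.e. something along these lines (dropping the dev argument, since
net_dbg_ratelimited does not take one):

                        net_dbg_ratelimited("%s: received data with undef peer-id from unknown source\n",
                                            __func__);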
>
>> +                        goto drop;
>> +                }
>> +        }
>> +
>> +        /* At this point we know the packet is from a configured peer.
>> +         * DATA_V2 packets are handled in kernel space, the rest goes to user
>> +         * space.
>> +         *
>> +         * Return 1 to instruct the stack to let the packet bubble up to
>> +         * userspace
>> +         */
>> +        if (unlikely(opcode != OVPN_DATA_V2)) {
>
> You already handled those earlier, before getting the peer.
Ouch, you're right. This check can just go.
>
>
> [...]
>> @@ -255,10 +368,20 @@ int ovpn_udp_socket_attach(struct socket *sock, struct ovpn_struct *ovpn)
>> return -EALREADY;
>> }
>>
>> - netdev_err(ovpn->dev, "%s: provided socket already taken by other user\n",
>> + netdev_err(ovpn->dev,
>> + "%s: provided socket already taken by other user\n",
>
> I guess you meant to break that line in the patch that introduced it,
> rather than here? :)
indeed.
>
>
>> +void ovpn_udp_socket_detach(struct socket *sock)
>> +{
>> +        struct udp_tunnel_sock_cfg cfg = { };
>> +
>> +        setup_udp_tunnel_sock(sock_net(sock->sk), sock, &cfg);
>
> I can't find anything in the kernel currently using
> setup_udp_tunnel_sock the way you're using it here.
>
> Does this provide any benefit compared to just letting the kernel
> disable encap when the socket goes away? Are you planning to detach
> and then re-attach the same socket?
Technically, we don't know what happens to this socket after we detach;
we have no guarantee that it will be closed.
Right now we detach when the instance is closed, so the socket will most
likely go away too, but I don't want to make hard assumptions about what
userspace may decide to do with this socket in the future.
If it doesn't hurt, why not do this easy cleanup?
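For completeness, this is my reading of what the zeroed cfg achieves here
(just the code above annotated, not part of the patch):

void ovpn_udp_socket_detach(struct socket *sock)
{
        struct udp_tunnel_sock_cfg cfg = { };

        /* with an all-zero cfg, setup_udp_tunnel_sock() clears sk_user_data,
         * sets encap_type back to 0 and resets the encap_rcv/encap_destroy
         * callbacks, so any further traffic on this socket is handled as
         * plain UDP again, whatever userspace decides to do with it
         */
        setup_udp_tunnel_sock(sock_net(sock->sk), sock, &cfg);
}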
Thanks!
>
--
Antonio Quartulli
OpenVPN Inc.