Message-ID: <73a305c5-57c1-40d9-825e-9e8390e093db@openvpn.net>
Date: Wed, 17 Jul 2024 17:30:17 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, ryazanov.s.a@...il.com,
pabeni@...hat.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v5 17/25] ovpn: implement keepalive mechanism
Hi,
On 15/07/2024 16:44, Sabrina Dubroca wrote:
> 2024-06-27, 15:08:35 +0200, Antonio Quartulli wrote:
>> +static const unsigned char ovpn_keepalive_message[] = {
>> + 0x2a, 0x18, 0x7b, 0xf3, 0x64, 0x1e, 0xb4, 0xcb,
>> + 0x07, 0xed, 0x2d, 0x0a, 0x98, 0x1f, 0xc7, 0x48
>> +};
>> +
>> +/**
>> + * ovpn_is_keepalive - check if skb contains a keepalive message
>> + * @skb: packet to check
>> + *
>> + * Assumes that the first byte of skb->data is defined.
>> + *
>> + * Return: true if skb contains a keepalive or false otherwise
>> + */
>> +static bool ovpn_is_keepalive(struct sk_buff *skb)
>> +{
>> + if (*skb->data != OVPN_KEEPALIVE_FIRST_BYTE)
>
> You could use ovpn_keepalive_message[0], and then you wouldn't need
> this extra constant.
Indeed, shame on me. Will do as suggested.
>
>> + return false;
>> +
>> + if (!pskb_may_pull(skb, sizeof(ovpn_keepalive_message)))
>> + return false;
>> +
>> + return !memcmp(skb->data, ovpn_keepalive_message,
>> + sizeof(ovpn_keepalive_message));
>
> Is a packet that contains some extra bytes after the exact keepalive
> considered a valid keepalive, or does it need to be the correct
> length?
I checked the userspace code and it assumes the length of the received
keepalive message to be the same as the ovpn_keepalive_message array.
So no extra bytes are expected; otherwise the message is no longer
considered a keepalive.
This means I must add an extra check before the memcmp to make sure
there is no extra data.
Good catch, thanks!
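
A minimal sketch of the exact-length check described above, written as
plain userspace C rather than against the skb API. The helper name
is_keepalive() and its (data, len) signature are hypothetical; in the
kernel the length test would presumably compare skb->len against
sizeof(ovpn_keepalive_message) before the memcmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Magic keepalive payload, copied from the patch quoted above. */
static const unsigned char keepalive_message[] = {
	0x2a, 0x18, 0x7b, 0xf3, 0x64, 0x1e, 0xb4, 0xcb,
	0x07, 0xed, 0x2d, 0x0a, 0x98, 0x1f, 0xc7, 0x48
};

/*
 * Hypothetical helper (not part of the patch): returns true only if the
 * buffer is exactly the keepalive payload, with no trailing bytes.
 */
static bool is_keepalive(const unsigned char *data, size_t len)
{
	assert(data != NULL);

	/* Reject short packets and packets carrying extra data. */
	if (len != sizeof(keepalive_message))
		return false;

	return memcmp(data, keepalive_message,
		      sizeof(keepalive_message)) == 0;
}
```

Doing the length comparison first also means memcmp never reads past
the end of a short buffer.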
>
>> +}
>> +
>> /* Called after decrypt to write the IP packet to the device.
>> * This method is expected to manage/free the skb.
>> */
>> @@ -91,6 +116,9 @@ void ovpn_decrypt_post(struct sk_buff *skb, int ret)
>> goto drop;
>> }
>>
>> + /* note event of authenticated packet received for keepalive */
>> + ovpn_peer_keepalive_recv_reset(peer);
>> +
>> /* point to encapsulated IP packet */
>> __skb_pull(skb, ovpn_skb_cb(skb)->payload_offset);
>>
>> @@ -107,6 +135,12 @@ void ovpn_decrypt_post(struct sk_buff *skb, int ret)
>> goto drop;
>> }
>>
>> + if (ovpn_is_keepalive(skb)) {
>> + netdev_dbg(peer->ovpn->dev,
>> + "ping received from peer %u\n", peer->id);
>
> That should probably be _ratelimited, but it seems we don't have
> _ratelimited variants for the netdev_* helpers.
Right.
I have used the net_*_ratelimited() variants where needed.
Too bad we don't have ratelimited variants of the netdev_* helpers.
>
>
>
>> +/**
>> + * ovpn_xmit_special - encrypt and transmit an out-of-band message to peer
>> + * @peer: peer to send the message to
>> + * @data: message content
>> + * @len: message length
>> + *
>> + * Assumes that caller holds a reference to peer
>> + */
>> +static void ovpn_xmit_special(struct ovpn_peer *peer, const void *data,
>> + const unsigned int len)
>> +{
>> + struct ovpn_struct *ovpn;
>> + struct sk_buff *skb;
>> +
>> + ovpn = peer->ovpn;
>> + if (unlikely(!ovpn))
>> + return;
>> +
>> + skb = alloc_skb(256 + len, GFP_ATOMIC);
>
> Where is that 256 coming from?
"Reasonable number" which should be enough[tm] to hold the entire packet.
>
>> + if (unlikely(!skb))
>> + return;
>
> Failure to send a keepalive should probably have a counter, to help
> users troubleshoot why their connection dropped.
> (can be done later unless someone insists)
This will be part of a more sophisticated error counting that I will
introduce later on.
>
>
>> + skb_reserve(skb, 128);
>
> And that 128?
same "logic" as 256.
>
>> + skb->priority = TC_PRIO_BESTEFFORT;
>> + memcpy(__skb_put(skb, len), data, len);
>
> nit: that's __skb_put_data
oh cool, thanks!
>
>> + /* increase reference counter when passing peer to sending queue */
>> + if (!ovpn_peer_hold(peer)) {
>> + netdev_dbg(ovpn->dev, "%s: cannot hold peer reference for sending special packet\n",
>> + __func__);
>> + kfree_skb(skb);
>> + return;
>> + }
>> +
>> + ovpn_send(ovpn, skb, peer);
>> +}
>> +
>> +/**
>> + * ovpn_keepalive_xmit - send keepalive message to peer
>> + * @peer: the peer to send the message to
>> + */
>> +void ovpn_keepalive_xmit(struct ovpn_peer *peer)
>> +{
>> + ovpn_xmit_special(peer, ovpn_keepalive_message,
>> + sizeof(ovpn_keepalive_message));
>> +}
>
> I don't see other users of ovpn_xmit_special in this series, if you
> don't have more planned in the future you could drop the extra function.
Initially there were plans, but I have always pushed back on any idea of
adding more unnecessary logic to the kernel side. So for now there is
nothing planned.
I'll remove the extra wrapper.
>
>
>> +/**
>> + * ovpn_peer_expire - timer task for incoming keepialive timeout
>
> typo: s/keepialive/keepalive/
Thanks
>
>
>
>> +/**
>> + * ovpn_peer_keepalive_set - configure keepalive values for peer
>> + * @peer: the peer to configure
>> + * @interval: outgoing keepalive interval
>> + * @timeout: incoming keepalive timeout
>> + */
>> +void ovpn_peer_keepalive_set(struct ovpn_peer *peer, u32 interval, u32 timeout)
>> +{
>> + u32 delta;
>> +
>> + netdev_dbg(peer->ovpn->dev,
>> + "%s: scheduling keepalive for peer %u: interval=%u timeout=%u\n",
>> + __func__, peer->id, interval, timeout);
>> +
>> + peer->keepalive_interval = interval;
>> + if (interval > 0) {
>> + delta = msecs_to_jiffies(interval * MSEC_PER_SEC);
>> + mod_timer(&peer->keepalive_xmit, jiffies + delta);
>
> Maybe something to consider in the future: this could be resetting a
> timer that was just about to go off to a somewhat distant time in the
> future. Not sure the peer will be happy about that (and not consider
> it a timeout).
Normally this timer is only set upon connection, or maybe upon some
future parameter exchange. In both cases we can assume the connection is
alive, so this case should not scare us.
But thanks for pointing it out
>
>> + } else {
>> + timer_delete(&peer->keepalive_xmit);
>> + }
>> +
>> + peer->keepalive_timeout = timeout;
>> + if (timeout) {
>
> pedantic nit: inconsistent style with the "interval > 0" test just
> above
ACK, will make them uniform.
>
>> + delta = msecs_to_jiffies(timeout * MSEC_PER_SEC);
>> + mod_timer(&peer->keepalive_recv, jiffies + delta);
>> + } else {
>> + timer_delete(&peer->keepalive_recv);
>> + }
>> +}
>> +
>
> [...]
>> +/**
>> + * ovpn_peer_keepalive_recv_reset - reset keepalive timeout
>> + * @peer: peer for which the timeout should be reset
>> + *
>> + * To be invoked upon reception of an authenticated packet from peer in order
>> + * to report valid activity and thus reset the keepalive timeout
>> + */
>> +static inline void ovpn_peer_keepalive_recv_reset(struct ovpn_peer *peer)
>> +{
>> + u32 delta = msecs_to_jiffies(peer->keepalive_timeout * MSEC_PER_SEC);
>> +
>> + if (unlikely(!delta))
>> + return;
>> +
>> + mod_timer(&peer->keepalive_recv, jiffies + delta);
>
> This (and ovpn_peer_keepalive_xmit_reset) is going to be called for
> each packet. I wonder how well the timer subsystem deals with one
> timer getting updated possibly thousands of times per second.
>
Could it even introduce a performance penalty?
Maybe we should get rid of the timer object and introduce a periodic
(1s) worker which checks some last_recv timestamp on every known peer?
What do you think?
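
To make the idea concrete, here is a rough userspace sketch of the
timestamp-plus-periodic-scan approach. Everything in it is an
assumption for illustration: struct peer_state, its fields, and both
function names are hypothetical and not taken from the patch, and time
is a plain monotonic counter in seconds rather than jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-peer state; field names are illustrative only. */
struct peer_state {
	uint64_t last_recv; /* time of last authenticated packet, seconds */
	uint32_t timeout;   /* keepalive timeout in seconds, 0 = disabled */
	bool expired;       /* set once the peer has timed out */
};

/* Per-packet path: only store a timestamp, no timer is re-armed. */
static void peer_note_recv(struct peer_state *p, uint64_t now)
{
	p->last_recv = now;
}

/*
 * Periodic scan (e.g. once per second): mark every peer that has been
 * silent for at least its timeout. Returns the number of peers newly
 * marked as expired by this call.
 */
static size_t peers_check_expired(struct peer_state *peers, size_t n,
				  uint64_t now)
{
	size_t newly_expired = 0;

	assert(peers != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct peer_state *p = &peers[i];

		if (p->timeout == 0 || p->expired)
			continue;

		/* Guard against a timestamp that is ahead of 'now'. */
		if (now >= p->last_recv &&
		    now - p->last_recv >= p->timeout) {
			p->expired = true;
			newly_expired++;
		}
	}

	return newly_expired;
}
```

The trade-off in this sketch is that a timeout is detected up to one
scan period late, in exchange for a per-packet cost of a single store.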
Thanks!
--
Antonio Quartulli
OpenVPN Inc.