Message-ID: <5d49ef6c-ad35-4199-b5af-0caae5a04e85@openvpn.net>
Date: Thu, 18 Jul 2024 11:37:38 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, ryazanov.s.a@...il.com,
pabeni@...hat.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v5 19/25] ovpn: add support for peer floating
On 17/07/2024 19:15, Sabrina Dubroca wrote:
> 2024-06-27, 15:08:37 +0200, Antonio Quartulli wrote:
>> +void ovpn_peer_float(struct ovpn_peer *peer, struct sk_buff *skb)
>> +{
>> + struct sockaddr_storage ss;
>> + const u8 *local_ip = NULL;
>> + struct sockaddr_in6 *sa6;
>> + struct sockaddr_in *sa;
>> + struct ovpn_bind *bind;
>> + sa_family_t family;
>> + size_t salen;
>> +
>> + rcu_read_lock();
>> + bind = rcu_dereference(peer->bind);
>> + if (unlikely(!bind))
>> + goto unlock;
>
> Why are you aborting here? ovpn_bind_skb_src_match considers
> bind==NULL to be "no match" (reasonable), then we would create a new
> bind for the current address.
(NOTE: float and the following explanation assume connection via UDP)
peer->bind is assigned right after peer creation in ovpn_nl_set_peer_doit().
ovpn_peer_float() is called while the peer is exchanging traffic.
If we got to this point and bind is NULL, then the peer was being
released, because there is no way we are going to NULLify bind during
the peer life cycle, except upon ovpn_peer_release().
Does it make sense?
>
>> +
>> + if (likely(ovpn_bind_skb_src_match(bind, skb)))
>
> This could be running in parallel on two CPUs, because ->encap_rcv
> isn't protected against that. So the bind could be getting updated in
> parallel. I would move spin_lock_bh above this check to make sure it
> doesn't happen.
Hm, I should actually use peer->lock for this, which is currently only
used in ovpn_bind_reset() to avoid multiple concurrent assignments. But
you're right, the lock should cover the call to
ovpn_bind_skb_src_match() as well.
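To make the point concrete, here is a minimal userspace sketch of holding one lock across both the source-address check and the update, so two CPUs cannot both see "no match" and race on the assignment. Everything named demo_* is hypothetical and not part of ovpn; the atomic_flag spinlock only stands in for spin_lock_bh(&peer->lock), and a plain address/port pair stands in for the bind.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the remote endpoint stored in the bind. */
struct demo_endpoint {
	uint32_t addr; /* IPv4 address */
	uint16_t port; /* UDP port */
};

/* Hypothetical stand-in for the peer; not the ovpn structure. */
struct demo_peer {
	atomic_flag lock;            /* plays the role of peer->lock */
	struct demo_endpoint remote; /* plays the role of peer->bind */
};

static void demo_lock(struct demo_peer *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
}

static void demo_unlock(struct demo_peer *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Check and update under the same lock.
 * Returns true if the stored endpoint was replaced.
 */
static bool demo_peer_float(struct demo_peer *p, struct demo_endpoint src)
{
	bool floated = false;

	demo_lock(p);
	/* The "does the source still match?" test is inside the lock. */
	if (p->remote.addr != src.addr || p->remote.port != src.port) {
		p->remote = src;
		floated = true;
	}
	demo_unlock(p);

	return floated;
}
```

With the check outside the lock, two receivers handling packets from the same new address could both decide to float and both perform the update; with the check inside, the second one finds a matching endpoint and does nothing.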
>
> ovpn_peer_update_local_endpoint would also need something like that, I
> think.
At least the test-and-set part should be protected, if
ovpn_peer_update_local_endpoint() can truly be invoked multiple times
concurrently.
How do I test running encap_rcv in parallel?
This is actually an interesting case that I assumed was not possible
(for no specific reason).
>
>> + goto unlock;
>> +
>> + family = skb_protocol_to_family(skb);
>> +
>> + if (bind->sa.in4.sin_family == family)
>> + local_ip = (u8 *)&bind->local;
>> +
>> + switch (family) {
>> + case AF_INET:
>> + sa = (struct sockaddr_in *)&ss;
>> + sa->sin_family = AF_INET;
>> + sa->sin_addr.s_addr = ip_hdr(skb)->saddr;
>> + sa->sin_port = udp_hdr(skb)->source;
>> + salen = sizeof(*sa);
>> + break;
>> + case AF_INET6:
>> + sa6 = (struct sockaddr_in6 *)&ss;
>> + sa6->sin6_family = AF_INET6;
>> + sa6->sin6_addr = ipv6_hdr(skb)->saddr;
>> + sa6->sin6_port = udp_hdr(skb)->source;
>> + sa6->sin6_scope_id = ipv6_iface_scope_id(&ipv6_hdr(skb)->saddr,
>> + skb->skb_iif);
>> + salen = sizeof(*sa6);
>> + break;
>> + default:
>> + goto unlock;
>> + }
>> +
>> + netdev_dbg(peer->ovpn->dev, "%s: peer %d floated to %pIScp", __func__,
>> + peer->id, &ss);
>> + ovpn_peer_reset_sockaddr(peer, (struct sockaddr_storage *)&ss,
>> + local_ip);
>> +
>> + spin_lock_bh(&peer->ovpn->peers->lock);
>> + /* remove old hashing */
>> + hlist_del_init_rcu(&peer->hash_entry_transp_addr);
>> + /* re-add with new transport address */
>> + hlist_add_head_rcu(&peer->hash_entry_transp_addr,
>> + ovpn_get_hash_head(peer->ovpn->peers->by_transp_addr,
>> + &ss, salen));
>
> That could send a concurrent reader onto the wrong hash bucket, if
> it's going through peer's old bucket, finds peer before the update,
> then continues reading after peer is moved to the new bucket.
I haven't fully grasped this scenario.
I am imagining we are running ovpn_peer_get_by_transp_addr() in
parallel: reader gets the old bucket and finds peer, because
ovpn_peer_transp_match() will still return true (update wasn't performed
yet), and will return it.
At this point, what do you mean by "continues reading after peer is
moved to the new bucket"?
>
> This kind of re-hash can be handled with nulls, and re-trying the
> lookup if we ended up on the wrong chain. See for example
> __inet_lookup_established in net/ipv4/inet_hashtables.c (Thanks to
> Paolo for the pointer).
>
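For reference, a single-threaded userspace sketch of the "nulls" idea mentioned above: each chain ends in a tagged marker that carries its bucket index, and a lookup that reaches a marker for a different bucket knows it was carried onto another chain and retries. All demo_* names are hypothetical; the kernel uses hlist_nulls for this, and the sketch moves a node by explicit bucket index rather than by rehashing a changed address.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKETS 4

/* Hypothetical stand-in for a peer hashed by transport address. */
struct demo_node {
	uintptr_t next; /* pointer to the next node, or a tagged end marker */
	int key;
};

static uintptr_t demo_heads[DEMO_BUCKETS];

/* An end marker has bit 0 set and stores the bucket index above it. */
static uintptr_t demo_make_nulls(unsigned int bucket)
{
	return ((uintptr_t)bucket << 1) | 1;
}

static int demo_is_nulls(uintptr_t p)
{
	return (int)(p & 1);
}

static unsigned int demo_hash(int key)
{
	return (unsigned int)key % DEMO_BUCKETS;
}

static void demo_init(void)
{
	for (unsigned int i = 0; i < DEMO_BUCKETS; i++)
		demo_heads[i] = demo_make_nulls(i);
}

static void demo_add(struct demo_node *n, unsigned int bucket)
{
	n->next = demo_heads[bucket];
	demo_heads[bucket] = (uintptr_t)n;
}

/* Unlink n; n->next is left intact so a reader standing on n can go on. */
static void demo_del(struct demo_node *n, unsigned int bucket)
{
	uintptr_t *pp = &demo_heads[bucket];

	while (!demo_is_nulls(*pp)) {
		struct demo_node *cur = (struct demo_node *)*pp;

		if (cur == n) {
			*pp = n->next;
			return;
		}
		pp = &cur->next;
	}
}

/* Walk from pos. Returns -1 and sets *found on a key match, otherwise
 * returns the bucket index stored in the end marker that was reached.
 */
static int demo_walk(uintptr_t pos, int key, struct demo_node **found)
{
	*found = NULL;
	while (!demo_is_nulls(pos)) {
		struct demo_node *cur = (struct demo_node *)pos;

		if (cur->key == key) {
			*found = cur;
			return -1;
		}
		pos = cur->next;
	}
	return (int)(pos >> 1);
}

static struct demo_node *demo_lookup(int key)
{
	unsigned int bucket = demo_hash(key);
	struct demo_node *found;

	for (;;) {
		int end = demo_walk(demo_heads[bucket], key, &found);

		if (found || end == (int)bucket)
			return found;
		/* Ended on a foreign chain: a node moved under us, retry. */
	}
}
```

The failure mode is a reader that is standing on a node when that node is re-hashed: it follows the node's new next pointer, reaches the end of the new chain, and misses the entries that were behind the node in the old chain. Comparing the end marker with the starting bucket is what detects this.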
>> + spin_unlock_bh(&peer->ovpn->peers->lock);
>> +
>> +unlock:
>> + rcu_read_unlock();
>> +}
>
--
Antonio Quartulli
OpenVPN Inc.