[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <daf370fa-0509-4e49-a2dc-90632d429112@openvpn.net>
Date: Thu, 18 Jul 2024 15:21:07 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, ryazanov.s.a@...il.com,
pabeni@...hat.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v5 19/25] ovpn: add support for peer floating
On 18/07/2024 13:12, Sabrina Dubroca wrote:
>> How do I test running encap_rcv in parallel?
>> This is actually an interesting case that I thought to not be possible (no
>> specific reason for this..).
>
> It should happen when the packets come from different source IPs and
> the NIC has multiple queues, then they can be spread over different
> CPUs. But it's probably not going to be easy to land multiple packets
> in ovpn_peer_float at the same time to trigger this issue.
I see. Yeah, this is not easy.
>
>
>>>> + netdev_dbg(peer->ovpn->dev, "%s: peer %d floated to %pIScp", __func__,
>>>> + peer->id, &ss);
>>>> + ovpn_peer_reset_sockaddr(peer, (struct sockaddr_storage *)&ss,
>>>> + local_ip);
>>>> +
>>>> + spin_lock_bh(&peer->ovpn->peers->lock);
>>>> + /* remove old hashing */
>>>> + hlist_del_init_rcu(&peer->hash_entry_transp_addr);
>>>> + /* re-add with new transport address */
>>>> + hlist_add_head_rcu(&peer->hash_entry_transp_addr,
>>>> + ovpn_get_hash_head(peer->ovpn->peers->by_transp_addr,
>>>> + &ss, salen));
>>>
>>> That could send a concurrent reader onto the wrong hash bucket, if
>>> it's going through peer's old bucket, finds peer before the update,
>>> then continues reading after peer is moved to the new bucket.
>>
>> I haven't fully grasped this scenario.
>> I am imagining we are running ovpn_peer_get_by_transp_addr() in parallel:
>> reader gets the old bucket and finds peer, because ovpn_peer_transp_match()
>> will still return true (update wasn't performed yet), and will return it.
>
> The other reader isn't necessarily looking for peer, but maybe another
> item that landed in the same bucket (though your hashtables are so
> large, it would be a bit unlucky).
>
>> At this point, what do you mean with "continues reading after peer is moved
>> to the new bucket"?
>
> Continues iterating, in hlist_for_each_entry_rcu inside
> ovpn_peer_get_by_transp_addr.
>
> ovpn_peer_float ovpn_peer_get_by_transp_addr
>
> start lookup
> head = ovpn_get_hash_head(...)
> hlist_for_each_entry_rcu
> ...
> find peer on head
>
> peer moved from head to head2
>
> continue hlist_for_each_entry_rcu with peer->next
> but peer->next is now on head2
> keep walking ->next on head2 instead of head
>
Ok got it.
Basically we might move the reader from a list to another without it
noticing.
Will have a look at the pointer provided by Paolo and modify this code
accordingly.
Thanks!
--
Antonio Quartulli
OpenVPN Inc.
Powered by blists - more mailing lists