Message-ID: <ZtmMKDPZzsFdbTpq@hog>
Date: Thu, 5 Sep 2024 12:47:04 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, pabeni@...hat.com,
ryazanov.s.a@...il.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v6 15/25] ovpn: implement multi-peer support
2024-09-05, 10:02:58 +0200, Antonio Quartulli wrote:
> On 03/09/2024 16:40, Sabrina Dubroca wrote:
> > 2024-08-27, 14:07:55 +0200, Antonio Quartulli wrote:
> > >  static int ovpn_net_init(struct net_device *dev)
> > >  {
> > >  	struct ovpn_struct *ovpn = netdev_priv(dev);
> > > +	int i, err = gro_cells_init(&ovpn->gro_cells, dev);
> >
> > I'm not a fan of "hiding" the gro_cells_init call up here. I'd prefer
> > if this was done just before the corresponding "if (err)".
>
> I am all with you, but I remember in the past something complaining about
> "variable declared and then re-assigned right after".
>
> But maybe this is not the case anymore.
If you had something like:

    int err;
    err = -EINVAL;

sure, it would make sense to combine them. But here, moving the
gro_cells_init() call down to just before the "if (err)" shouldn't
trigger that kind of complaint.
>
> Will move the initialization down.
Thanks.
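I.e. something like this (untested, and assuming the error path just
returns err as in the current patch):

	struct ovpn_struct *ovpn = netdev_priv(dev);
	int i, err;

	err = gro_cells_init(&ovpn->gro_cells, dev);
	if (err)
		return err;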
> > > +
> > > +		spin_lock_init(&ovpn->peers->lock_by_id);
> > > +		spin_lock_init(&ovpn->peers->lock_by_vpn_addr);
> > > +		spin_lock_init(&ovpn->peers->lock_by_transp_addr);
> >
> > What's the benefit of having 3 separate locks instead of a single lock
> > protecting all the hashtables?
>
> The main reason was to avoid a deadlock - I thought I had added a comment
> about it...
Ok.
I could have missed it, I'm not looking at the comments much now that
I'm familiar with the code.
> The problem was a deadlock between acquiring peer->lock and
> ovpn->peers->lock in float() and in the opposite sequence in peers_free().
> (IIRC this happens due to ovpn_peer_reset_sockaddr() acquiring peer->lock)
I don't see a problem with ovpn_peer_reset_sockaddr, but ovpn_peer_put
can be called with lock_by_id held and then take peer->lock (in
ovpn_peer_release), which would be the opposite order to
ovpn_peer_float if the locks were merged (peer->lock then
lock_by_transp_addr).
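To spell out the two orders I have in mind (from my reading of the
code, so I may well be missing something):

    ovpn_peer_float():                     peer->lock -> lock_by_transp_addr
    ovpn_peer_put()/ovpn_peer_release():   lock_by_id -> peer->lock

With a single peers lock, those two chains turn into the classic ABBA
pattern.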
This should be solvable with a single lock by delaying the bind
cleanup via call_rcu instead of doing it immediately in
ovpn_peer_release: after that delay, nothing should be using
peer->bind anymore, since there are no references left and no
remaining rcu_read_lock sections that could have found the peer, so
we can free it immediately without taking peer->lock. And I think
it's a bit more "correct" wrt RCU rules, since at ovpn_peer_put time,
even with refcount=0, we could still have a reader using the peer and
deciding to update its bind (not the case with how ovpn_peer_float is
called, since we have a reference on the peer).
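Roughly what I had in mind, completely untested, with made-up names
(an rcu_head embedded in struct ovpn_peer, ovpn_peer_free() for the
rest of the teardown) and assuming the bind is a plain kfree'd
allocation:

static void ovpn_peer_release_rcu(struct rcu_head *head)
{
	struct ovpn_peer *peer = container_of(head, struct ovpn_peer, rcu);

	/* the grace period has elapsed: no reader can still be looking
	 * at peer->bind, so it can be freed without taking peer->lock
	 */
	kfree(rcu_dereference_protected(peer->bind, true));
	ovpn_peer_free(peer);
}

static void ovpn_peer_release(struct ovpn_peer *peer)
{
	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
}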
(This could be completely wrong and/or make no sense at all :))
But I'm not going to insist on this, you can keep the separate locks.
> Splitting the larger peers->lock allowed me to avoid this scenario, because
> I don't need to jump through any hoops to coordinate access to different
> hashtables.
>
> >
> > > +
> > > +		for (i = 0; i < ARRAY_SIZE(ovpn->peers->by_id); i++) {
> > > +			INIT_HLIST_HEAD(&ovpn->peers->by_id[i]);
> > > +			INIT_HLIST_HEAD(&ovpn->peers->by_vpn_addr[i]);
> > > +			INIT_HLIST_NULLS_HEAD(&ovpn->peers->by_transp_addr[i],
> > > +					      i);
> > > +		}
> > > +	}
> > > +
> > > +	return 0;
> > >  }
> >
> > > +static int ovpn_peer_add_mp(struct ovpn_struct *ovpn, struct ovpn_peer *peer)
> > > +{
> > > +	struct sockaddr_storage sa = { 0 };
> > > +	struct hlist_nulls_head *nhead;
> > > +	struct sockaddr_in6 *sa6;
> > > +	struct sockaddr_in *sa4;
> > > +	struct hlist_head *head;
> > > +	struct ovpn_bind *bind;
> > > +	struct ovpn_peer *tmp;
> > > +	size_t salen;
> > > +
> > > +	spin_lock_bh(&ovpn->peers->lock_by_id);
> > > +	/* do not add duplicates */
> > > +	tmp = ovpn_peer_get_by_id(ovpn, peer->id);
> > > +	if (tmp) {
> > > +		ovpn_peer_put(tmp);
> > > +		spin_unlock_bh(&ovpn->peers->lock_by_id);
> > > +		return -EEXIST;
> > > +	}
> > > +
> > > +	hlist_add_head_rcu(&peer->hash_entry_id,
> > > +			   ovpn_get_hash_head(ovpn->peers->by_id, &peer->id,
> > > +					      sizeof(peer->id)));
> > > +	spin_unlock_bh(&ovpn->peers->lock_by_id);
> > > +
> > > +	bind = rcu_dereference_protected(peer->bind, true);
> > > +	/* peers connected via TCP have bind == NULL */
> > > +	if (bind) {
> > > +		switch (bind->remote.in4.sin_family) {
> > > +		case AF_INET:
> > > +			sa4 = (struct sockaddr_in *)&sa;
> > > +
> > > +			sa4->sin_family = AF_INET;
> > > +			sa4->sin_addr.s_addr = bind->remote.in4.sin_addr.s_addr;
> > > +			sa4->sin_port = bind->remote.in4.sin_port;
> > > +			salen = sizeof(*sa4);
> > > +			break;
> > > +		case AF_INET6:
> > > +			sa6 = (struct sockaddr_in6 *)&sa;
> > > +
> > > +			sa6->sin6_family = AF_INET6;
> > > +			sa6->sin6_addr = bind->remote.in6.sin6_addr;
> > > +			sa6->sin6_port = bind->remote.in6.sin6_port;
> > > +			salen = sizeof(*sa6);
> > > +			break;
> > > +		default:
> >
> > And remove from the by_id hashtable? Or is that handled somewhere that
> > I missed (I don't think ovpn_peer_unhash gets called in that case)?
>
> No, we don't call unhash in this case, as we assume the add just failed
> entirely.
>
> I will add the removal before returning the error (moving the add below the
> switch would extend the locked area too much).
I don't think setting a few variables would be too much to do under
the lock (and it would address the issues in my 2nd reply to this
patch).
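For reference, the shape I had in mind, completely untested (the
error value in the default case is just a placeholder for whatever
that path returns in your tree; the later by_vpn_addr/by_transp_addr
insertions stay as they are):

	spin_lock_bh(&ovpn->peers->lock_by_id);
	/* do not add duplicates */
	tmp = ovpn_peer_get_by_id(ovpn, peer->id);
	if (tmp) {
		ovpn_peer_put(tmp);
		spin_unlock_bh(&ovpn->peers->lock_by_id);
		return -EEXIST;
	}

	bind = rcu_dereference_protected(peer->bind, true);
	/* peers connected via TCP have bind == NULL */
	if (bind) {
		switch (bind->remote.in4.sin_family) {
		case AF_INET:
			sa4 = (struct sockaddr_in *)&sa;

			sa4->sin_family = AF_INET;
			sa4->sin_addr.s_addr = bind->remote.in4.sin_addr.s_addr;
			sa4->sin_port = bind->remote.in4.sin_port;
			salen = sizeof(*sa4);
			break;
		case AF_INET6:
			sa6 = (struct sockaddr_in6 *)&sa;

			sa6->sin6_family = AF_INET6;
			sa6->sin6_addr = bind->remote.in6.sin6_addr;
			sa6->sin6_port = bind->remote.in6.sin6_port;
			salen = sizeof(*sa6);
			break;
		default:
			/* nothing was inserted yet, so there is nothing
			 * to undo on this error path
			 */
			spin_unlock_bh(&ovpn->peers->lock_by_id);
			return -EPROTONOSUPPORT; /* placeholder */
		}
	}

	/* insert into by_id only once we know we won't bail out */
	hlist_add_head_rcu(&peer->hash_entry_id,
			   ovpn_get_hash_head(ovpn->peers->by_id, &peer->id,
					      sizeof(peer->id)));
	spin_unlock_bh(&ovpn->peers->lock_by_id);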
--
Sabrina