netdev - Re: [PATCH net-next v6 15/25] ovpn: implement multi-peer support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZtmMKDPZzsFdbTpq@hog>
Date: Thu, 5 Sep 2024 12:47:04 +0200
From: Sabrina Dubroca <sd@...asysnail.net>
To: Antonio Quartulli <antonio@...nvpn.net>
Cc: netdev@...r.kernel.org, kuba@...nel.org, pabeni@...hat.com,
	ryazanov.s.a@...il.com, edumazet@...gle.com, andrew@...n.ch
Subject: Re: [PATCH net-next v6 15/25] ovpn: implement multi-peer support

2024-09-05, 10:02:58 +0200, Antonio Quartulli wrote:
> On 03/09/2024 16:40, Sabrina Dubroca wrote:
> > 2024-08-27, 14:07:55 +0200, Antonio Quartulli wrote:
> > >   static int ovpn_net_init(struct net_device *dev)
> > >   {
> > >   	struct ovpn_struct *ovpn = netdev_priv(dev);
> > > +	int i, err = gro_cells_init(&ovpn->gro_cells, dev);
> > 
> > I'm not a fan of "hiding" the gro_cells_init call up here. I'd prefer
> > if this was done just before the corresponding "if (err)".
> 
> I am all with you, but I remember in the past something complaining about
> "variable declared and then re-assigned right after".
> 
> But maybe this is not the case anymore.

If you had something like:

	int err;

	err = -EINVAL;

sure, it would make sense to combine them.

> 
> Will move the initialization down.

Thanks.


> > > +
> > > +		spin_lock_init(&ovpn->peers->lock_by_id);
> > > +		spin_lock_init(&ovpn->peers->lock_by_vpn_addr);
> > > +		spin_lock_init(&ovpn->peers->lock_by_transp_addr);
> > 
> > What's the benefit of having 3 separate locks instead of a single lock
> > protecting all the hashtables?
> 
> The main reason was to avoid a deadlock - I thought I had added a comment
> about it...

Ok.

I could have missed it, I'm not looking at the comments much now that
I'm familiar with the code.

> The problem was a deadlock between acquiring peer->lock and
> ovpn->peers->lock in float() and in then opposite sequence in peers_free().
> (IIRC this happens due to ovpn_peer_reset_sockaddr() acquiring peer->lock)

I don't see a problem with ovpn_peer_reset_sockaddr, but ovpn_peer_put
can be called with lock_by_id held and then take peer->lock (in
ovpn_peer_release), which would be the opposite order to
ovpn_peer_float if the locks were merged (peer->lock then
lock_by_transp_addr).

This should be solvable with a single lock by delaying the bind
cleanup via call_rcu instead of doing it immediately with
ovpn_peer_release (after that delay, nothing should be using
peer->bind anymore, since we have no reference and no more
rcu_read_lock sections that could have found peer, so we can free
immediately and no need to take peer->lock). And it's I think a bit
more "correct" wrt RCU rules, since at ovpn_peer_put time, even with
refcount=0, we could have a reader still using the peer and deciding
to update its bind (not the case with how ovpn_peer_float is called,
since we have a reference on the peer).

(This could be completely wrong and/or make no sense at all :))

But I'm not going to insist on this, you can keep the separate locks.


> Splitting the larger peers->lock allowed me to avoid this scenario, because
> I don't need to jump through any hoop to coordinate access to different
> hashtables.
> 
> > 
> > > +
> > > +		for (i = 0; i < ARRAY_SIZE(ovpn->peers->by_id); i++) {
> > > +			INIT_HLIST_HEAD(&ovpn->peers->by_id[i]);
> > > +			INIT_HLIST_HEAD(&ovpn->peers->by_vpn_addr[i]);
> > > +			INIT_HLIST_NULLS_HEAD(&ovpn->peers->by_transp_addr[i],
> > > +					      i);
> > > +		}
> > > +	}
> > > +
> > > +	return 0;
> > >   }
> > 
> > > +static int ovpn_peer_add_mp(struct ovpn_struct *ovpn, struct ovpn_peer *peer)
> > > +{
> > > +	struct sockaddr_storage sa = { 0 };
> > > +	struct hlist_nulls_head *nhead;
> > > +	struct sockaddr_in6 *sa6;
> > > +	struct sockaddr_in *sa4;
> > > +	struct hlist_head *head;
> > > +	struct ovpn_bind *bind;
> > > +	struct ovpn_peer *tmp;
> > > +	size_t salen;
> > > +
> > > +	spin_lock_bh(&ovpn->peers->lock_by_id);
> > > +	/* do not add duplicates */
> > > +	tmp = ovpn_peer_get_by_id(ovpn, peer->id);
> > > +	if (tmp) {
> > > +		ovpn_peer_put(tmp);
> > > +		spin_unlock_bh(&ovpn->peers->lock_by_id);
> > > +		return -EEXIST;
> > > +	}
> > > +
> > > +	hlist_add_head_rcu(&peer->hash_entry_id,
> > > +			   ovpn_get_hash_head(ovpn->peers->by_id, &peer->id,
> > > +					      sizeof(peer->id)));
> > > +	spin_unlock_bh(&ovpn->peers->lock_by_id);
> > > +
> > > +	bind = rcu_dereference_protected(peer->bind, true);
> > > +	/* peers connected via TCP have bind == NULL */
> > > +	if (bind) {
> > > +		switch (bind->remote.in4.sin_family) {
> > > +		case AF_INET:
> > > +			sa4 = (struct sockaddr_in *)&sa;
> > > +
> > > +			sa4->sin_family = AF_INET;
> > > +			sa4->sin_addr.s_addr = bind->remote.in4.sin_addr.s_addr;
> > > +			sa4->sin_port = bind->remote.in4.sin_port;
> > > +			salen = sizeof(*sa4);
> > > +			break;
> > > +		case AF_INET6:
> > > +			sa6 = (struct sockaddr_in6 *)&sa;
> > > +
> > > +			sa6->sin6_family = AF_INET6;
> > > +			sa6->sin6_addr = bind->remote.in6.sin6_addr;
> > > +			sa6->sin6_port = bind->remote.in6.sin6_port;
> > > +			salen = sizeof(*sa6);
> > > +			break;
> > > +		default:
> > 
> > And remove from the by_id hashtable? Or is that handled somewhere that
> > I missed (I don't think ovpn_peer_unhash gets called in that case)?
> 
> No we don't call unhash in this case as we assume the adding just failed
> entirely.
> 
> I will add the removal before returning the error (moving the add below the
> switch would extend the locked area too much.)

I don't think setting a few variables would be too much to do under
the lock (and it would address the issues in my 2nd reply to this
patch).

-- 
Sabrina