lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <60cae774-b60b-4a4b-8645-91eb6f186032@openvpn.net>
Date: Wed, 8 May 2024 22:31:51 +0200
From: Antonio Quartulli <antonio@...nvpn.net>
To: Sabrina Dubroca <sd@...asysnail.net>
Cc: netdev@...r.kernel.org, Jakub Kicinski <kuba@...nel.org>,
 Sergey Ryazanov <ryazanov.s.a@...il.com>, Paolo Abeni <pabeni@...hat.com>,
 Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew@...n.ch>,
 Esben Haabendal <esben@...nix.com>
Subject: Re: [PATCH net-next v3 07/24] ovpn: introduce the ovpn_peer object

On 08/05/2024 18:06, Sabrina Dubroca wrote:
> 2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
>> An ovpn_peer object holds the whole status of a remote peer
>> (regardless whether it is a server or a client).
>>
>> This includes status for crypto, tx/rx buffers, napi, etc.
>>
>> Only support for one peer is introduced (P2P mode).
>> Multi peer support is introduced with a later patch.
>>
>> Along with the ovpn_peer, also the ovpn_bind object is introcued
>                                                           ^
> typo: "introduced"

thanks

> 
>> as the two are strictly related.
>> An ovpn_bind object wraps a sockaddr representing the local
>> coordinates being used to talk to a specific peer.
> 
>> diff --git a/drivers/net/ovpn/bind.c b/drivers/net/ovpn/bind.c
>> new file mode 100644
>> index 000000000000..c1f842c06e32
>> --- /dev/null
>> +++ b/drivers/net/ovpn/bind.c
>> +static void ovpn_bind_release_rcu(struct rcu_head *head)
>> +{
>> +	struct ovpn_bind *bind = container_of(head, struct ovpn_bind, rcu);
>> +
>> +	kfree(bind);
>> +}
>> +
>> +void ovpn_bind_reset(struct ovpn_peer *peer, struct ovpn_bind *new)
>> +{
>> +	struct ovpn_bind *old;
>> +
>> +	spin_lock_bh(&peer->lock);
>> +	old = rcu_replace_pointer(peer->bind, new, true);
>> +	spin_unlock_bh(&peer->lock);
>> +
>> +	if (old)
>> +		call_rcu(&old->rcu, ovpn_bind_release_rcu);
> 
> Isn't that just kfree_rcu? (note kfree_rcu doesn't need the NULL check)

yeah, you're right. I think ovpn_bind_release_rcu() was more complex in 
the past, but got reduced step by step...will directly use kfree_rcu().

> 
>> +}
> 
> 
>> diff --git a/drivers/net/ovpn/bind.h b/drivers/net/ovpn/bind.h
>> new file mode 100644
>> index 000000000000..61433550a961
>> --- /dev/null
>> +++ b/drivers/net/ovpn/bind.h
> [...]
>> +static inline bool ovpn_bind_skb_src_match(const struct ovpn_bind *bind,
>> +					   struct sk_buff *skb)
> 
> nit: I think skb can also be const here

right

> 
> 
>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>> index 338e99dfe886..a420bb45f25f 100644
>> --- a/drivers/net/ovpn/io.c
>> +++ b/drivers/net/ovpn/io.c
>> @@ -13,6 +13,7 @@
>>   #include "io.h"
>>   #include "ovpnstruct.h"
>>   #include "netlink.h"
>> +#include "peer.h"
>>   
>>   int ovpn_struct_init(struct net_device *dev)
>>   {
>> @@ -25,6 +26,13 @@ int ovpn_struct_init(struct net_device *dev)
>>   	if (err < 0)
>>   		return err;
>>   
>> +	spin_lock_init(&ovpn->lock);
>> +
>> +	ovpn->events_wq = alloc_workqueue("ovpn-events-wq-%s", WQ_MEM_RECLAIM,
>> +					  0, dev->name);
> 
> I'm not convinced this will get freed consistently if
> register_netdevice fails early (before ndo_init).  After talking to
> Paolo, it seems this should be moved into a new ->ndo_init instead.

oh good point. I didn't consider that register_netdevice could fail that 
early.

> 
>> +	if (!ovpn->events_wq)
>> +		return -ENOMEM;
>> +
>>   	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
>>   	if (!dev->tstats)
>>   		return -ENOMEM;
>> diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
>> index cc8a97a1a189..dba35ecb236b 100644
>> --- a/drivers/net/ovpn/main.c
>> +++ b/drivers/net/ovpn/main.c
>> @@ -37,6 +39,9 @@ static void ovpn_struct_free(struct net_device *net)
>>   	rtnl_unlock();
>>   
>>   	free_percpu(net->tstats);
>> +	flush_workqueue(ovpn->events_wq);
>> +	destroy_workqueue(ovpn->events_wq);
> 
> Is the flush needed? I'm not an expert on workqueues, but from a quick
> look at destroy_workqueue it calls drain_workqueue, which would take
> care of flushing the queue?

you're right. drain_workqueue calls __flush_workqueue as often as needed 
to empty the queue.
Therefore I can get rid of my flush_worqueue invocation.

> 
>> +	rcu_barrier();
>>   }
>>   
> 
> [...]
>> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
>> index ee05b8a2c61d..b79d4f0474b0 100644
>> --- a/drivers/net/ovpn/ovpnstruct.h
>> +++ b/drivers/net/ovpn/ovpnstruct.h
>> @@ -17,12 +17,19 @@
>>    * @dev: the actual netdev representing the tunnel
>>    * @registered: whether dev is still registered with netdev or not
>>    * @mode: device operation mode (i.e. p2p, mp, ..)
>> + * @lock: protect this object
>> + * @event_wq: used to schedule generic events that may sleep and that need to be
>> + *            performed outside of softirq context
>> + * @peer: in P2P mode, this is the only remote peer
>>    * @dev_list: entry for the module wide device list
>>    */
>>   struct ovpn_struct {
>>   	struct net_device *dev;
>>   	bool registered;
>>   	enum ovpn_mode mode;
>> +	spinlock_t lock; /* protect writing to the ovpn_struct object */
> 
> nit: the comment isn't really needed since you have kdoc saying the same thing

True, but checkpatch.pl (or some other script?) was still throwing a 
warning, therefore I added this comment to silence it.

> 
>> +	struct workqueue_struct *events_wq;
>> +	struct ovpn_peer __rcu *peer;
>>   	struct list_head dev_list;
>>   };
>>   
>> diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
>> new file mode 100644
>> index 000000000000..2948b7320d47
>> --- /dev/null
>> +++ b/drivers/net/ovpn/peer.c
> [...]
>> +/**
>> + * ovpn_peer_free - release private members and free peer object
>> + * @peer: the peer to free
>> + */
>> +static void ovpn_peer_free(struct ovpn_peer *peer)
>> +{
>> +	ovpn_bind_reset(peer, NULL);
>> +
>> +	WARN_ON(!__ptr_ring_empty(&peer->tx_ring));
> 
> Could you pass a destructor to ptr_ring_cleanup instead of all these WARNs?

hmm but if we remove the WARNs then we lose the possibility to catch 
potential bugs, no? rings should definitely be empty at this point.

Or you think I should just not care and free any potentially remaining item?

> 
>> +	ptr_ring_cleanup(&peer->tx_ring, NULL);
>> +	WARN_ON(!__ptr_ring_empty(&peer->rx_ring));
>> +	ptr_ring_cleanup(&peer->rx_ring, NULL);
>> +	WARN_ON(!__ptr_ring_empty(&peer->netif_rx_ring));
>> +	ptr_ring_cleanup(&peer->netif_rx_ring, NULL);
>> +
>> +	dst_cache_destroy(&peer->dst_cache);
>> +
>> +	dev_put(peer->ovpn->dev);
>> +
>> +	kfree(peer);
>> +}
> 
> [...]
>> +void ovpn_peer_release(struct ovpn_peer *peer)
>> +{
>> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>> +}
>> +
>> +/**
>> + * ovpn_peer_delete_work - work scheduled to release peer in process context
>> + * @work: the work object
>> + */
>> +static void ovpn_peer_delete_work(struct work_struct *work)
>> +{
>> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>> +					      delete_work);
>> +	ovpn_peer_release(peer);
> 
> Does call_rcu really need to run in process context?

Reason for switching to process context is that we have to invoke 
ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and 
the latter requires a reference to the peer.

For this reason I thought it would be safe to have 
ovpn_nl_notify_del_peer and call_rcu invoked by the same context.

If I invoke call_rcu in ovpn_peer_release_kref, how can I be sure that 
the peer hasn't been free'd already when ovpn_nl_notify_del_peer is 
executed?


> 
>> +}
> 
> [...]
>> +/**
>> + * ovpn_peer_transp_match - check if sockaddr and peer binding match
>> + * @peer: the peer to get the binding from
>> + * @ss: the sockaddr to match
>> + *
>> + * Return: true if sockaddr and binding match or false otherwise
>> + */
>> +static bool ovpn_peer_transp_match(struct ovpn_peer *peer,
>> +				   struct sockaddr_storage *ss)
>> +{
> [...]
>> +	case AF_INET6:
>> +		sa6 = (struct sockaddr_in6 *)ss;
>> +		if (memcmp(&sa6->sin6_addr, &bind->sa.in6.sin6_addr,
>> +			   sizeof(struct in6_addr)))
> 
> ipv6_addr_equal?
> 

definitely. thanks

>> +			return false;
>> +		if (sa6->sin6_port != bind->sa.in6.sin6_port)
>> +			return false;
>> +		break;
> 
> [...]
>> +struct ovpn_peer *ovpn_peer_get_by_id(struct ovpn_struct *ovpn, u32 peer_id)
>> +{
>> +	struct ovpn_peer *peer = NULL;
>> +
>> +	if (ovpn->mode == OVPN_MODE_P2P)
>> +		peer = ovpn_peer_get_by_id_p2p(ovpn, peer_id);
>> +
>> +	return peer;
>> +}
>> +
>> +/**
>> + * ovpn_peer_add_p2p - add per to related tables in a P2P instance
>                                ^
> typo: peer?

yeah

> 
> 
> [...]
>> +/**
>> + * ovpn_peer_del_p2p - delete peer from related tables in a P2P instance
>> + * @peer: the peer to delete
>> + * @reason: reason why the peer was deleted (sent to userspace)
>> + *
>> + * Return: 0 on success or a negative error code otherwise
>> + */
>> +static int ovpn_peer_del_p2p(struct ovpn_peer *peer,
>> +			     enum ovpn_del_peer_reason reason)
>> +{
>> +	struct ovpn_peer *tmp;
>> +	int ret = -ENOENT;
>> +
>> +	spin_lock_bh(&peer->ovpn->lock);
>> +	tmp = rcu_dereference(peer->ovpn->peer);
>> +	if (tmp != peer)
>> +		goto unlock;
> 
> How do we recover if all those objects got out of sync? Are we stuck
> with a broken peer?

mhhh I don't fully get the scenario you are depicting.

In P2P mode there is only peer stored (reference is saved in ovpn->peer)

When we want to get rid of it, we invoke ovpn_peer_del_p2p().
The check we are performing here is just about being sure that we are 
removing the exact peer we requested to remove (and not some other peer 
that was still floating around for some reason).

> 
> And if this happens during interface deletion, aren't we leaking the
> peer memory here?

at interface deletion we call

ovpn_iface_destruct -> ovpn_peer_release_p2p -> 
ovpn_peer_del_p2p(ovpn->peer)

so at the last step we just ask to remove the very same peer that is 
curently stored, which should just never fail.

makes sense?

> 
>> +	ovpn_peer_put(tmp);
>> +	tmp->delete_reason = reason;
>> +	RCU_INIT_POINTER(peer->ovpn->peer, NULL);
>> +	ret = 0;
>> +
>> +unlock:
>> +	spin_unlock_bh(&peer->ovpn->lock);
>> +
>> +	return ret;
>> +}
> 
> [...]
>> diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
>> new file mode 100644
>> index 000000000000..659df320525c
>> --- /dev/null
>> +++ b/drivers/net/ovpn/peer.h
> [...]
>> +/**
>> + * struct ovpn_peer - the main remote peer object
>> + * @ovpn: main openvpn instance this peer belongs to
>> + * @id: unique identifier
>> + * @vpn_addrs.ipv4: IPv4 assigned to peer on the tunnel
>> + * @vpn_addrs.ipv6: IPv6 assigned to peer on the tunnel
>> + * @tx_ring: queue of outgoing poackets to this peer
>> + * @rx_ring: queue of incoming packets from this peer
>> + * @netif_rx_ring: queue of packets to be sent to the netdevice via NAPI
>> + * @dst_cache: cache for dst_entry used to send to peer
>> + * @bind: remote peer binding
>> + * @halt: true if ovpn_peer_mark_delete was called
>> + * @delete_reason: why peer was deleted (i.e. timeout, transport error, ..)
>> + * @lock: protects binding to peer (bind)
>> + * @refcount: reference counter
>> + * @rcu: used to free peer in an RCU safe way
>> + * @delete_work: deferred cleanup work, used to notify userspace
>> + */
>> +struct ovpn_peer {
>> +	struct ovpn_struct *ovpn;
>> +	u32 id;
>> +	struct {
>> +		struct in_addr ipv4;
>> +		struct in6_addr ipv6;
>> +	} vpn_addrs;
>> +	struct ptr_ring tx_ring;
>> +	struct ptr_ring rx_ring;
>> +	struct ptr_ring netif_rx_ring;
>> +	struct dst_cache dst_cache;
>> +	struct ovpn_bind __rcu *bind;
>> +	bool halt;
>> +	enum ovpn_del_peer_reason delete_reason;
>> +	spinlock_t lock; /* protects bind */
> 
> nit: the comment isn't really needed, it's redundant with kdoc.

like before, also here I had a warning which I wanted to silence.

> 
>> +	struct kref refcount;
>> +	struct rcu_head rcu;
>> +	struct work_struct delete_work;
>> +};
> 

Thanks a lot!


-- 
Antonio Quartulli
OpenVPN Inc.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ