[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YTMxnfkJVA8b6lAV@Laptop-X1>
Date: Sat, 4 Sep 2021 16:43:09 +0800
From: Hangbin Liu <liuhangbin@...il.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>
Cc: netdev@...r.kernel.org, David Miller <davem@...emloft.net>,
Xiumei Mu <xmu@...hat.com>,
Toke Høiland-Jørgensen <toke@...hat.com>,
Paolo Abeni <pabeni@...hat.com>, wireguard@...ts.zx2c4.com
Subject: Re: [PATCH net] wireguard: remove peer cache in netns_pre_exit
On Wed, Sep 01, 2021 at 03:55:43PM +0200, Jason A. Donenfeld wrote:
> Hi Hangbin,
>
> Thanks for the patch and especially for the test. While I see that
> you've pointed to a real problem, I don't think that this particular way
> of fixing it is correct, because it will cause issues for userspace that
> expects to be able to read back the list of peers for, for example,
> keeping track of the latest endpoint addresses or rx/tx transfer
> quantities.
>
> I think the real solution here is to simply clear the endpoint src cache
> and consequently the dst_cache. This is slightly complicated by the fact
> that dst_cache releases dsts lazily, so I needed to add a little utility
> function for that, but that was pretty easy to do.
>
> Can you take a look at the below patch and let me know if it works for
> you and passes other testing you and Toke might be doing with it? (Also,
> please CC the wireguard mailing list in addition to netdev next time?)
> If the patch looks good to you and works well, I'll include it in the
> next series of wireguard patches I send back out to netdev. I'm back
> from travels next week and will begin working on the next series then.
Hi Jason,
I tested your patch on both physical and virtual machines. All works fine.
So please feel free to add
Tested-by: Hangbin Liu <liuhangbin@...il.com>
Thanks
Hangbin
>
> Regards,
> Jason
>
> ---------8<-------------8<-----------------
>
> From f9984a41eeaebfdcef5aba8a71966b77ba0de8c0 Mon Sep 17 00:00:00 2001
> From: "Jason A. Donenfeld" <Jason@...c4.com>
> Date: Wed, 1 Sep 2021 14:53:39 +0200
> Subject: [PATCH] wireguard: device: reset peer src endpoint when netns exits
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Each peer's endpoint contains a dst_cache entry that takes a reference
> to another netdev. When the containing namespace exits, we take down the
> socket and prevent future sockets from being created (by setting
> creating_net to NULL), which removes that potential reference on the
> netns. However, it doesn't release references to the netns that a netdev
> cached in dst_cache might be taking, so the netns still might fail to
> exit. Since the socket is gimped anyway, we can simply clear all the
> dst_caches (by way of clearing the endpoint src), which will release all
> references.
>
> However, the current dst_cache_reset function only releases those
> references lazily. But it turns out that all of our usages of
> wg_socket_clear_peer_endpoint_src are called from contexts that are not
> exactly high-speed or bottle-necked. For example, when there's
> connection difficulty, or when userspace is reconfiguring the interface.
> And in particular for this patch, when the netns is exiting. So for
> those cases, it makes more sense to call dst_release immediately. For
> that, we add a small helper function to dst_cache.
>
> This patch also adds a test to netns.sh from Hangbin Liu to ensure this
> doesn't regress.
>
> Test-by: Hangbin Liu <liuhangbin@...il.com>
> Reported-by: Xiumei Mu <xmu@...hat.com>
> Cc: Hangbin Liu <liuhangbin@...il.com>
> Cc: Toke Høiland-Jørgensen <toke@...hat.com>
> Cc: Paolo Abeni <pabeni@...hat.com>
> Fixes: 900575aa33a3 ("wireguard: device: avoid circular netns references")
> Signed-off-by: Jason A. Donenfeld <Jason@...c4.com>
> ---
> drivers/net/wireguard/device.c | 3 +++
> drivers/net/wireguard/socket.c | 2 +-
> include/net/dst_cache.h | 11 ++++++++++
> net/core/dst_cache.c | 19 +++++++++++++++++
> tools/testing/selftests/wireguard/netns.sh | 24 +++++++++++++++++++++-
> 5 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
> index 551ddaaaf540..77e64ea6be67 100644
> --- a/drivers/net/wireguard/device.c
> +++ b/drivers/net/wireguard/device.c
> @@ -398,6 +398,7 @@ static struct rtnl_link_ops link_ops __read_mostly = {
> static void wg_netns_pre_exit(struct net *net)
> {
> struct wg_device *wg;
> + struct wg_peer *peer;
>
> rtnl_lock();
> list_for_each_entry(wg, &device_list, device_list) {
> @@ -407,6 +408,8 @@ static void wg_netns_pre_exit(struct net *net)
> mutex_lock(&wg->device_update_lock);
> rcu_assign_pointer(wg->creating_net, NULL);
> wg_socket_reinit(wg, NULL, NULL);
> + list_for_each_entry(peer, &wg->peer_list, peer_list)
> + wg_socket_clear_peer_endpoint_src(peer);
> mutex_unlock(&wg->device_update_lock);
> }
> }
> diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
> index 8c496b747108..6f07b949cb81 100644
> --- a/drivers/net/wireguard/socket.c
> +++ b/drivers/net/wireguard/socket.c
> @@ -308,7 +308,7 @@ void wg_socket_clear_peer_endpoint_src(struct wg_peer *peer)
> {
> write_lock_bh(&peer->endpoint_lock);
> memset(&peer->endpoint.src6, 0, sizeof(peer->endpoint.src6));
> - dst_cache_reset(&peer->endpoint_cache);
> + dst_cache_reset_now(&peer->endpoint_cache);
> write_unlock_bh(&peer->endpoint_lock);
> }
>
> diff --git a/include/net/dst_cache.h b/include/net/dst_cache.h
> index 67634675e919..df6622a5fe98 100644
> --- a/include/net/dst_cache.h
> +++ b/include/net/dst_cache.h
> @@ -79,6 +79,17 @@ static inline void dst_cache_reset(struct dst_cache *dst_cache)
> dst_cache->reset_ts = jiffies;
> }
>
> +/**
> + * dst_cache_reset_now - invalidate the cache contents immediately
> + * @dst_cache: the cache
> + *
> + * The caller must be sure there are no concurrent users, as this frees
> + * all dst_cache users immediately, rather than waiting for the next
> + * per-cpu usage like dst_cache_reset does. Most callers should use the
> + * higher speed lazily-freed dst_cache_reset function instead.
> + */
> +void dst_cache_reset_now(struct dst_cache *dst_cache);
> +
> /**
> * dst_cache_init - initialize the cache, allocating the required storage
> * @dst_cache: the cache
> diff --git a/net/core/dst_cache.c b/net/core/dst_cache.c
> index be74ab4551c2..0ccfd5fa5cb9 100644
> --- a/net/core/dst_cache.c
> +++ b/net/core/dst_cache.c
> @@ -162,3 +162,22 @@ void dst_cache_destroy(struct dst_cache *dst_cache)
> free_percpu(dst_cache->cache);
> }
> EXPORT_SYMBOL_GPL(dst_cache_destroy);
> +
> +void dst_cache_reset_now(struct dst_cache *dst_cache)
> +{
> + int i;
> +
> + if (!dst_cache->cache)
> + return;
> +
> + dst_cache->reset_ts = jiffies;
> + for_each_possible_cpu(i) {
> + struct dst_cache_pcpu *idst = per_cpu_ptr(dst_cache->cache, i);
> + struct dst_entry *dst = idst->dst;
> +
> + idst->cookie = 0;
> + idst->dst = NULL;
> + dst_release(dst);
> + }
> +}
> +EXPORT_SYMBOL_GPL(dst_cache_reset_now);
> diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh
> index 2e5c1630885e..8a9461aa0878 100755
> --- a/tools/testing/selftests/wireguard/netns.sh
> +++ b/tools/testing/selftests/wireguard/netns.sh
> @@ -613,6 +613,28 @@ ip0 link set wg0 up
> kill $ncat_pid
> ip0 link del wg0
>
> +# Ensure that dst_cache references don't outlive netns lifetime
> +ip1 link add dev wg0 type wireguard
> +ip2 link add dev wg0 type wireguard
> +configure_peers
> +ip1 link add veth1 type veth peer name veth2
> +ip1 link set veth2 netns $netns2
> +ip1 addr add fd00:aa::1/64 dev veth1
> +ip2 addr add fd00:aa::2/64 dev veth2
> +ip1 link set veth1 up
> +ip2 link set veth2 up
> +waitiface $netns1 veth1
> +waitiface $netns2 veth2
> +ip1 -6 route add default dev veth1 via fd00:aa::2
> +ip2 -6 route add default dev veth2 via fd00:aa::1
> +n1 wg set wg0 peer "$pub2" endpoint [fd00:aa::2]:2
> +n2 wg set wg0 peer "$pub1" endpoint [fd00:aa::1]:1
> +n1 ping6 -c 1 fd00::2
> +pp ip netns delete $netns1
> +pp ip netns delete $netns2
> +pp ip netns add $netns1
> +pp ip netns add $netns2
> +
> # Ensure there aren't circular reference loops
> ip1 link add wg1 type wireguard
> ip2 link add wg2 type wireguard
> @@ -631,7 +653,7 @@ while read -t 0.1 -r line 2>/dev/null || [[ $? -ne 142 ]]; do
> done < /dev/kmsg
> alldeleted=1
> for object in "${!objects[@]}"; do
> - if [[ ${objects["$object"]} != *createddestroyed ]]; then
> + if [[ ${objects["$object"]} != *createddestroyed && ${objects["$object"]} != *createdcreateddestroyeddestroyed ]]; then
> echo "Error: $object: merely ${objects["$object"]}" >&3
> alldeleted=0
> fi
> --
> 2.32.0
Powered by blists - more mailing lists