Message-ID: <1d0c553c-18bb-4d0d-8358-eff0b65c6c56@davidwei.uk>
Date: Sun, 5 May 2024 19:27:29 -0700
From: David Wei <dw@...idwei.uk>
To: Wen Gu <guwen@...ux.alibaba.com>, wenjia@...ux.ibm.com,
jaka@...ux.ibm.com, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com
Cc: alibuda@...ux.alibaba.com, tonylu@...ux.alibaba.com,
linux-s390@...r.kernel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] net/smc: fix netdev refcnt leak in
smc_ib_find_route()
On 2024-05-05 18:54, Wen Gu wrote:
> A netdev refcnt leak issue was found when unregistering netdev after
> using SMC. It can be reproduced as follows.
>
> - run tests based on SMC.
> - unregister the net device.
>
> The following error message can be observed.
>
> 'unregister_netdevice: waiting for ethx to become free. Usage count = x'
>
> With CONFIG_NET_DEV_REFCNT_TRACKER set, more detailed error message can
> be provided by refcount tracker:
>
> unregister_netdevice: waiting for eth1 to become free. Usage count = 2
> ref_tracker: eth%d@...f9cabc3bf8548 has 1/1 users at
> ___neigh_create+0x8e/0x420
> neigh_event_ns+0x52/0xc0
> arp_process+0x7c0/0x860
> __netif_receive_skb_list_core+0x258/0x2c0
> __netif_receive_skb_list+0xea/0x150
> netif_receive_skb_list_internal+0xf2/0x1b0
> napi_complete_done+0x73/0x1b0
> mlx5e_napi_poll+0x161/0x5e0 [mlx5_core]
> __napi_poll+0x2c/0x1c0
> net_rx_action+0x2a7/0x380
> __do_softirq+0xcd/0x2a7
>
> It is because in smc_ib_find_route(), neigh_lookup() takes a netdev
> refcnt but does not release. So fix it.
>
> Fixes: e5c4744cfb59 ("net/smc: add SMC-Rv2 connection establishment")
> Signed-off-by: Wen Gu <guwen@...ux.alibaba.com>
> ---
> net/smc/smc_ib.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index 97704a9e84c7..b431bd8a5172 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -210,10 +210,11 @@ int smc_ib_find_route(struct net *net, __be32 saddr, __be32 daddr,
> goto out;
> if (rt->rt_uses_gateway && rt->rt_gw_family != AF_INET)
> goto out;
> - neigh = rt->dst.ops->neigh_lookup(&rt->dst, NULL, &fl4.daddr);
> + neigh = dst_neigh_lookup(&rt->dst, &fl4.daddr);
Of the two implementations of neigh_lookup() I found that do not simply
return NULL, both increment or initialise struct neighbour::refcnt:
1. ipv4_neigh_lookup()
2. ip6_dst_neigh_lookup()
   a. __ipv6_neigh_lookup()
   b. neigh_create()
> if (neigh) {
> memcpy(nexthop_mac, neigh->ha, ETH_ALEN);
> *uses_gateway = rt->rt_uses_gateway;
> + neigh_release(neigh);
So releasing it here looks correct.
> return 0;
> }
> out:
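
For illustration only, the reference balance that the patch restores can be
modelled in userspace roughly as below. This is a sketch, not kernel code:
the mock_* names are hypothetical, and struct mock_neigh only mimics the
two fields of struct neighbour that matter here. It assumes the lookup
returns the entry with one extra reference held, as described above.

```c
#include <assert.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for struct neighbour: only the fields used here. */
struct mock_neigh {
	int refcnt;
	unsigned char ha[ETH_ALEN];
};

/* Models a lookup that hands back the entry with a reference held. */
static struct mock_neigh *mock_neigh_lookup(struct mock_neigh *n)
{
	if (n)
		n->refcnt++;
	return n;
}

/* Models neigh_release(): drops the reference taken by the lookup. */
static void mock_neigh_release(struct mock_neigh *n)
{
	assert(n->refcnt > 0);
	n->refcnt--;
}

/* Mirrors the patched path: copy the MAC, then drop the reference. */
static int mock_find_route(struct mock_neigh *entry, unsigned char *mac)
{
	struct mock_neigh *neigh = mock_neigh_lookup(entry);

	if (!neigh)
		return -1;
	memcpy(mac, neigh->ha, ETH_ALEN);
	mock_neigh_release(neigh);
	return 0;
}
```

In this model, dropping the mock_neigh_release() call leaves refcnt one
higher after every successful call, which corresponds to the leaked
reference that the tracker output in the commit message points at.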