Message-ID: <9ccc0635-7c0e-4a18-8469-9c5b6d9b268f@linux.dev>
Date: Fri, 19 Dec 2025 19:51:37 -0800
From: Zhu Yanjun <yanjun.zhu@...ux.dev>
To: Stefan Metzmacher <metze@...ba.org>, linux-rdma@...r.kernel.org
Cc: Zhu Yanjun <zyjzyj2000@...il.com>, Jason Gunthorpe <jgg@...pe.ca>,
Leon Romanovsky <leon@...nel.org>,
Shinichiro Kawasaki <shinichiro.kawasaki@....com>, netdev@...r.kernel.org,
linux-cifs@...r.kernel.org
Subject: Re: [PATCH] RDMA/rxe: let rxe_reclassify_recv_socket() call
sk_owner_put()
On 2025/12/19 6:04, Stefan Metzmacher wrote:
> On kernels built with CONFIG_PROVE_LOCKING, CONFIG_MODULES
> and CONFIG_DEBUG_LOCK_ALLOC 'rmmod rdma_rxe' is no longer
> possible.
>
> For the global recv sockets rxe_net_exit() is where we
> call rxe_release_udp_tunnel -> udp_tunnel_sock_release(),
> which means the sockets are destroyed before 'rmmod rdma_rxe'
> finishes, so there's no need to protect against
> rxe_recv_slock_key and rxe_recv_sk_key disappearing
> while the sockets are still alive.
>
> Fixes: 80a85a771deb ("RDMA/rxe: reclassify sockets in order to avoid false positives from lockdep")
> Cc: Zhu Yanjun <zyjzyj2000@...il.com>
> Cc: Jason Gunthorpe <jgg@...pe.ca>
> Cc: Leon Romanovsky <leon@...nel.org>
> Cc: Shinichiro Kawasaki <shinichiro.kawasaki@....com>
> Cc: linux-rdma@...r.kernel.org
> Cc: netdev@...r.kernel.org
> Cc: linux-cifs@...r.kernel.org
> Signed-off-by: Stefan Metzmacher <metze@...ba.org>
Thanks a lot. IIRC, there is a similar commit for the SIW driver, so I am
not sure whether the SIW driver has a similar problem.
Reviewed-by: Zhu Yanjun <yanjun.zhu@...ux.dev>
Zhu Yanjun
> ---
> drivers/infiniband/sw/rxe/rxe_net.c | 32 +++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
> index 0195d361e5e3..0bd0902b11f7 100644
> --- a/drivers/infiniband/sw/rxe/rxe_net.c
> +++ b/drivers/infiniband/sw/rxe/rxe_net.c
> @@ -64,7 +64,39 @@ static inline void rxe_reclassify_recv_socket(struct socket *sock)
> break;
> default:
> WARN_ON_ONCE(1);
> + return;
> }
> + /*
> + * sock_lock_init_class_and_name() calls
> + * sk_owner_set(sk, THIS_MODULE); in order
> + * to make sure the referenced global
> + * variables rxe_recv_slock_key and
> + * rxe_recv_sk_key are not removed
> + * before the socket is closed.
> + *
> + * However this prevents rxe_net_exit()
> + * from being called and 'rmmod rdma_rxe'
> + * is refused because of the references.
> + *
> + * For the global sockets in recv_sockets,
> + * we are sure that rxe_net_exit() will call
> + * rxe_release_udp_tunnel -> udp_tunnel_sock_release.
> + *
> + * So we don't need the additional reference to
> + * our own (THIS_MODULE).
> + */
> + sk_owner_put(sk);
> + /*
> + * We also call sk_owner_clear() otherwise
> + * sk_owner_put(sk) in sk_prot_free will
> + * fail, which is called via
> + * sk_free -> __sk_free -> sk_destruct
> + * and sk_destruct calls __sk_destruct
> + * directly or via call_rcu()
> + * so sk_prot_free() might be called
> + * after rxe_net_exit().
> + */
> + sk_owner_clear(sk);
> #endif /* CONFIG_DEBUG_LOCK_ALLOC */
> }
>
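The reference handling described in the two patch comments above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_ name is hypothetical, the module refcount is reduced to a plain int, and the final put stands in for the sk_owner_put() that the patch comment says happens in sk_prot_free().

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical, not kernel APIs). */
struct toy_module { int refcnt; };
struct toy_sock { struct toy_module *owner; };

/* Models sk_owner_set(): take a module reference and remember the owner. */
static void toy_owner_set(struct toy_sock *sk, struct toy_module *m)
{
	assert(m != NULL);
	m->refcnt++;
	sk->owner = m;
}

/* Models sk_owner_put(): drop the reference if an owner is recorded. */
static void toy_owner_put(struct toy_sock *sk)
{
	if (sk->owner)
		sk->owner->refcnt--;
}

/* Models sk_owner_clear(): forget the owner so a later put does nothing. */
static void toy_owner_clear(struct toy_sock *sk)
{
	sk->owner = NULL;
}

struct toy_result {
	int while_alive;	/* refcount while the socket exists */
	int after_free;		/* refcount after the socket is freed */
};

/*
 * Reclassify a socket, optionally drop and/or clear the owner reference
 * (what the patch adds), then free the socket.
 */
static struct toy_result toy_run(int do_put, int do_clear)
{
	struct toy_module mod = { .refcnt = 0 };
	struct toy_sock sk = { .owner = NULL };
	struct toy_result r;

	toy_owner_set(&sk, &mod);	/* reclassification takes a reference */
	if (do_put)
		toy_owner_put(&sk);
	if (do_clear)
		toy_owner_clear(&sk);
	r.while_alive = mod.refcnt;

	toy_owner_put(&sk);		/* put performed when the socket is freed */
	r.after_free = mod.refcnt;
	return r;
}
```

In this model, toy_run(0, 0) leaves a reference held while the socket is alive, which corresponds to the refused rmmod. toy_run(1, 1) holds none and stays balanced after the free, while toy_run(1, 0) ends below zero, which is the unbalanced put that the sk_owner_clear() call is there to avoid.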