[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iJk5Pc4kcamjkjLF1xcNkdKHh+HmcRcXnVcaU0cXd9Cfw@mail.gmail.com>
Date: Wed, 4 May 2022 17:53:42 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: Paolo Abeni <pabeni@...hat.com>,
patchwork-bot+netdevbpf@...nel.org,
Santosh Shilimkar <santosh.shilimkar@...cle.com>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
syzbot <syzbot+694120e1002c117747ed@...kaller.appspotmail.com>,
netdev <netdev@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
linux-rdma <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH] net: rds: use maybe_get_net() when acquiring refcount on
TCP sockets
On Wed, May 4, 2022 at 5:45 PM Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> Eric Dumazet is reporting addition on 0 problem at rds_tcp_tune(), for
> delayed works queued in rds_wq might be invoked after a net namespace's
> refcount already reached 0.
>
> Since rds_tcp_exit_net() from cleanup_net() calls flush_workqueue(rds_wq),
> it is guaranteed that we can instead use maybe_get_net() from delayed work
> functions until rds_tcp_exit_net() returns.
>
> Note that I'm not convinced that all works which might access a net
> namespace are already queued in rds_wq by the moment rds_tcp_exit_net()
> calls flush_workqueue(rds_wq). If some race is there, rds_tcp_exit_net()
> will fail to wait for work functions, and kmem_cache_free() could be
> called from net_free() before maybe_get_net() is called from
> rds_tcp_tune().
>
> Reported-by: Eric Dumazet <edumazet@...gle.com>
> Fixes: 3a58f13a881ed351 ("net: rds: acquire refcount on TCP sockets")
> Signed-off-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> ---
> net/rds/tcp.c | 11 ++++++++---
> net/rds/tcp.h | 2 +-
> net/rds/tcp_connect.c | 5 ++++-
> net/rds/tcp_listen.c | 5 ++++-
> 4 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/net/rds/tcp.c b/net/rds/tcp.c
> index 2f638f8b7b1e..8e26bcf02044 100644
> --- a/net/rds/tcp.c
> +++ b/net/rds/tcp.c
> @@ -487,11 +487,11 @@ struct rds_tcp_net {
> /* All module specific customizations to the RDS-TCP socket should be done in
> * rds_tcp_tune() and applied after socket creation.
> */
> -void rds_tcp_tune(struct socket *sock)
> +bool rds_tcp_tune(struct socket *sock)
> {
> struct sock *sk = sock->sk;
> struct net *net = sock_net(sk);
> - struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
> + struct rds_tcp_net *rtn;
>
> tcp_sock_set_nodelay(sock->sk);
> lock_sock(sk);
> @@ -499,10 +499,14 @@ void rds_tcp_tune(struct socket *sock)
> * a process which created this net namespace terminated.
> */
> if (!sk->sk_net_refcnt) {
> + if (!maybe_get_net(net)) {
> + release_sock(sk);
> + return false;
> + }
> sk->sk_net_refcnt = 1;
> - get_net_track(net, &sk->ns_tracker, GFP_KERNEL);
This could use:
netns_tracker_alloc(net, &sk->ns_tracker, GFP_KERNEL);
> sock_inuse_add(net, 1);
> }
> + rtn = net_generic(net, rds_tcp_netid);
> if (rtn->sndbuf_size > 0) {
> sk->sk_sndbuf = rtn->sndbuf_size;
> sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
> @@ -512,6 +516,7 @@ void rds_tcp_tune(struct socket *sock)
> sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
> }
> release_sock(sk);
> + return true;
> }
>
Otherwise, patch looks good to me, thanks.
Powered by blists - more mailing lists