netdev - Re: [PATCH] net: rds: use maybe_get_net() when acquiring refcount on TCP sockets

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANn89iJk5Pc4kcamjkjLF1xcNkdKHh+HmcRcXnVcaU0cXd9Cfw@mail.gmail.com>
Date:   Wed, 4 May 2022 17:53:42 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Paolo Abeni <pabeni@...hat.com>,
        patchwork-bot+netdevbpf@...nel.org,
        Santosh Shilimkar <santosh.shilimkar@...cle.com>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        syzbot <syzbot+694120e1002c117747ed@...kaller.appspotmail.com>,
        netdev <netdev@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        linux-rdma <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH] net: rds: use maybe_get_net() when acquiring refcount on
 TCP sockets

On Wed, May 4, 2022 at 5:45 PM Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> Eric Dumazet is reporting addition on 0 problem at rds_tcp_tune(), for
> delayed works queued in rds_wq might be invoked after a net namespace's
> refcount already reached 0.
>
> Since rds_tcp_exit_net() from cleanup_net() calls flush_workqueue(rds_wq),
> it is guaranteed that we can instead use maybe_get_net() from delayed work
> functions until rds_tcp_exit_net() returns.
>
> Note that I'm not convinced that all works which might access a net
> namespace are already queued in rds_wq by the moment rds_tcp_exit_net()
> calls flush_workqueue(rds_wq). If some race is there, rds_tcp_exit_net()
> will fail to wait for work functions, and kmem_cache_free() could be
> called from net_free() before maybe_get_net() is called from
> rds_tcp_tune().
>
> Reported-by: Eric Dumazet <edumazet@...gle.com>
> Fixes: 3a58f13a881ed351 ("net: rds: acquire refcount on TCP sockets")
> Signed-off-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> ---
>  net/rds/tcp.c         | 11 ++++++++---
>  net/rds/tcp.h         |  2 +-
>  net/rds/tcp_connect.c |  5 ++++-
>  net/rds/tcp_listen.c  |  5 ++++-
>  4 files changed, 17 insertions(+), 6 deletions(-)
>
> diff --git a/net/rds/tcp.c b/net/rds/tcp.c
> index 2f638f8b7b1e..8e26bcf02044 100644
> --- a/net/rds/tcp.c
> +++ b/net/rds/tcp.c
> @@ -487,11 +487,11 @@ struct rds_tcp_net {
>  /* All module specific customizations to the RDS-TCP socket should be done in
>   * rds_tcp_tune() and applied after socket creation.
>   */
> -void rds_tcp_tune(struct socket *sock)
> +bool rds_tcp_tune(struct socket *sock)
>  {
>         struct sock *sk = sock->sk;
>         struct net *net = sock_net(sk);
> -       struct rds_tcp_net *rtn = net_generic(net, rds_tcp_netid);
> +       struct rds_tcp_net *rtn;
>
>         tcp_sock_set_nodelay(sock->sk);
>         lock_sock(sk);
> @@ -499,10 +499,14 @@ void rds_tcp_tune(struct socket *sock)
>          * a process which created this net namespace terminated.
>          */
>         if (!sk->sk_net_refcnt) {
> +               if (!maybe_get_net(net)) {


> +                       release_sock(sk);
> +                       return false;
> +               }
>                 sk->sk_net_refcnt = 1;
> -               get_net_track(net, &sk->ns_tracker, GFP_KERNEL);

This could use:
                  netns_tracker_alloc(net, &sk->ns_tracker, GFP_KERNEL);

>                 sock_inuse_add(net, 1);
>         }
> +       rtn = net_generic(net, rds_tcp_netid);
>         if (rtn->sndbuf_size > 0) {
>                 sk->sk_sndbuf = rtn->sndbuf_size;
>                 sk->sk_userlocks |= SOCK_SNDBUF_LOCK;
> @@ -512,6 +516,7 @@ void rds_tcp_tune(struct socket *sock)
>                 sk->sk_userlocks |= SOCK_RCVBUF_LOCK;
>         }
>         release_sock(sk);
> +       return true;
>  }
>

Otherwise, patch looks good to me, thanks.