lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 9 Apr 2022 10:55:40 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     bpf <bpf@...r.kernel.org>,
        syzbot <syzbot+694120e1002c117747ed@...kaller.appspotmail.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Andrii Nakryiko <andriin@...com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Miller <davem@...emloft.net>,
        David Ahern <dsahern@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        Martin KaFai Lau <kafai@...com>,
        KP Singh <kpsingh@...nel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        netdev <netdev@...r.kernel.org>,
        Song Liu <songliubraving@...com>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        tpa@...hospital.com, Yonghong Song <yhs@...com>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Trond Myklebust <trondmy@...merspace.com>
Subject: Re: [syzbot] KASAN: use-after-free Read in tcp_retransmit_timer (5)

On Sat, Apr 9, 2022 at 10:47 AM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Sat, Apr 9, 2022 at 9:46 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Sat, Apr 9, 2022 at 1:19 AM Tetsuo Handa
> > <penguin-kernel@...ove.sakura.ne.jp> wrote:
> > >
> > > Hello, bpf developers.
> > >
> > > syzbot is reporting use-after-free increment at __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEOUTS).
> >
> >
> > Try removing NFS from your kernel .config ? If your repro still works,
> > then another user of kernel TCP socket needs some care.
> >
> > NFS maintainers and other folks are already working on fixing this issue,
> > which is partly caused by fs/file_table.c being able to delay fput(),
> > look at code in fput_many()
> >
> > Kernel TCP sockets are tricky, they (for good reasons) do not take a
> > reference on the net namespace.
> >
> > This also means that users of such sockets need to make sure the
> > various tcp timers have been completed,
> > as sk_stop_timer() is not using del_timer_sync()
> >
> > Even after a synchronous fput(), there is no guarantee that another
> > cpu is not running some of the socket timers functions.
>
> So please add to your tree the NFS fix:
>
> commit f00432063db1a0db484e85193eccc6845435b80e
> Author: Trond Myklebust <trond.myklebust@...merspace.com>
> Date:   Sun Apr 3 15:58:11 2022 -0400
>
>     SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()
>
>     We must ensure that all sockets are closed before we call xprt_free()
>     and release the reference to the net namespace. The problem is that
>     calling fput() will defer closing the socket until delayed_fput() gets
>     called.
>     Let's fix the situation by allowing rpciod and the transport teardown
>     code (which runs on the system wq) to call __fput_sync(), and directly
>     close the socket.
>
>     Reported-by: Felix Fu <foyjog@...il.com>
>     Acked-by: Al Viro <viro@...iv.linux.org.uk>
>     Fixes: a73881c96d73 ("SUNRPC: Fix an Oops in udp_poll()")
>     Cc: stable@...r.kernel.org # 5.1.x: 3be232f11a3c: SUNRPC: Prevent
> immediate close+reconnect
>     Cc: stable@...r.kernel.org # 5.1.x: 89f42494f92f: SUNRPC: Don't
> call connect() more than once on a TCP socket
>     Cc: stable@...r.kernel.org # 5.1.x
>     Signed-off-by: Trond Myklebust <trond.myklebust@...merspace.com>
>
> Then on top of that, add the following fix (I will formally submit
> this one once back to work, Monday morning)
>
> diff --git a/include/net/inet_connection_sock.h
> b/include/net/inet_connection_sock.h
> index 3908296d103fd2de9284adea64dba94fe6b8720f..e2c856ae4fdbef5bd3c7728e376786b804e2d4f1
> 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -171,6 +171,7 @@ void inet_csk_init_xmit_timers(struct sock *sk,
>                                void (*delack_handler)(struct timer_list *),
>                                void (*keepalive_handler)(struct timer_list *));
>  void inet_csk_clear_xmit_timers(struct sock *sk);
> +void inet_csk_clear_xmit_timers_sync(struct sock *sk);
>
>  static inline void inet_csk_schedule_ack(struct sock *sk)
>  {
> diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> index 1e5b53c2bb2670fc90b789e853458f5c86a00c27..aab83b766014d0a091a73bdc13376d9cdae99b27
> 100644
> --- a/net/ipv4/inet_connection_sock.c
> +++ b/net/ipv4/inet_connection_sock.c
> @@ -581,6 +581,17 @@ void inet_csk_clear_xmit_timers(struct sock *sk)
>  }
>  EXPORT_SYMBOL(inet_csk_clear_xmit_timers);
>
> +void inet_csk_clear_xmit_timers_sync(struct sock *sk)
> +{
> +       struct inet_connection_sock *icsk = inet_csk(sk);
> +
> +       icsk->icsk_pending = icsk->icsk_ack.pending = 0;
> +
> +       sk_stop_timer_sync(sk, &icsk->icsk_retransmit_timer);
> +       sk_stop_timer_sync(sk, &icsk->icsk_delack_timer);
> +       sk_stop_timer_sync(sk, &sk->sk_timer);
> +}
> +
>  void inet_csk_delete_keepalive_timer(struct sock *sk)
>  {
>         sk_stop_timer(sk, &sk->sk_timer);
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e31cf137c6140f76f838b4a0dcddf9f104ad653b..3dacd202bf2af43c55ffe820c08316150d2018ea
> 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2928,6 +2928,8 @@ void tcp_close(struct sock *sk, long timeout)
>         lock_sock(sk);
>         __tcp_close(sk, timeout);
>         release_sock(sk);
> +       if (!sk->sk_net_refcnt)
> +               inet_csk_clear_xmit_timers_sync(sk);
>         sock_put(sk);
>  }
>  EXPORT_SYMBOL(tcp_close);

Side note: We will probably be able to revert this patch, that perhaps
was working around the real issue.

commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d
Author: Dan Streetman <ddstreet@...e.org>
Date:   Thu Jan 18 16:14:26 2018 -0500

    net: tcp: close sock if net namespace is exiting

    When a tcp socket is closed, if it detects that its net namespace is
    exiting, close immediately and do not wait for FIN sequence.

    For normal sockets, a reference is taken to their net namespace, so it will
    never exit while the socket is open.  However, kernel sockets do not take a
    reference to their net namespace, so it may begin exiting while the kernel
    socket is still open.  In this case if the kernel socket is a tcp socket,
    it will stay open trying to complete its close sequence.  The sock's dst(s)
    hold a reference to their interface, which are all transferred to the
    namespace's loopback interface when the real interfaces are taken down.
    When the namespace tries to take down its loopback interface, it hangs
    waiting for all references to the loopback interface to release, which
    results in messages like:

    unregister_netdevice: waiting for lo to become free. Usage count = 1

    These messages continue until the socket finally times out and closes.
    Since the net namespace cleanup holds the net_mutex while calling its
    registered pernet callbacks, any new net namespace initialization is
    blocked until the current net namespace finishes exiting.

    After this change, the tcp socket notices the exiting net namespace, and
    closes immediately, releasing its dst(s) and their reference to the
    loopback interface, which lets the net namespace continue exiting.

    Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
    Signed-off-by: Dan Streetman <ddstreet@...onical.com>
    Signed-off-by: David S. Miller <davem@...emloft.net>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ