netdev - Re: net/sctp: recursive locking in sctp_do

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CACT4Y+bCpT5V268gyVXYNmgMDAHy8fJZdTGL73ZG_ERfoaWh1w@mail.gmail.com>
Date:   Wed, 15 Mar 2017 11:14:56 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        Vladislav Yasevich <vyasevich@...il.com>,
        Neil Horman <nhorman@...driver.com>,
        David Miller <davem@...emloft.net>, linux-sctp@...r.kernel.org,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        syzkaller <syzkaller@...glegroups.com>
Subject: Re: net/sctp: recursive locking in sctp_do_peeloff

On Wed, Mar 15, 2017 at 5:52 AM, Cong Wang <xiyou.wangcong@...il.com> wrote:
> On Fri, Mar 10, 2017 at 12:04 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>> On Fri, Mar 10, 2017 at 8:46 PM, Marcelo Ricardo Leitner
>> <marcelo.leitner@...il.com> wrote:
>>> On Fri, Mar 10, 2017 at 4:11 PM, Dmitry Vyukov <dvyukov@...gle.com> wrote:
>>>> Hello,
>>>>
>>>> I've got the following recursive locking report while running
>>>> syzkaller fuzzer on net-next/9c28286b1b4b9bce6e35dd4c8a1265f03802a89a:
>>>>
>>>> [ INFO: possible recursive locking detected ]
>>>> 4.10.0+ #14 Not tainted
>>>> ---------------------------------------------
>>>> syz-executor3/5560 is trying to acquire lock:
>>>>  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff8401ebcd>] lock_sock
>>>> include/net/sock.h:1460 [inline]
>>>>  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff8401ebcd>]
>>>> sctp_close+0xcd/0x9d0 net/sctp/socket.c:1497
>>>>
>>>> but task is already holding lock:
>>>>  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff84038110>] lock_sock
>>>> include/net/sock.h:1460 [inline]
>>>>  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff84038110>]
>>>> sctp_getsockopt+0x450/0x67e0 net/sctp/socket.c:6611
>>>>
>>>> other info that might help us debug this:
>>>>  Possible unsafe locking scenario:
>>>>
>>>>        CPU0
>>>>        ----
>>>>   lock(sk_lock-AF_INET6);
>>>>   lock(sk_lock-AF_INET6);
>>>>
>>>>  *** DEADLOCK ***
>>>>
>>>>  May be due to missing lock nesting notation
>>>
>>> Pretty much the case, I suppose. The lock held by sctp_getsockopt() is
>>> on one socket, while the other lock that sctp_close() is getting later
>>> is on the newly created (which failed) socket during peeloff
>>> operation.
>>
>>
>> Does this mean that never-ever lock 2 sockets at a time except for
>> this case? If so, it probably suggests that this case should not do it
>> either.
>>
>
> Yeah, actually for the error path we don't even need to lock sock
> since it is newly allocated and no one else could see it yet.
>
> Instead of checking for the status of the sock, I believe the following
> one-line fix should do the trick too. Can you give it a try?
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 0f378ea..4de62d4 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -1494,7 +1494,7 @@ static void sctp_close(struct sock *sk, long timeout)
>
>         pr_debug("%s: sk:%p, timeout:%ld\n", __func__, sk, timeout);
>
> -       lock_sock(sk);
> +       lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
>         sk->sk_shutdown = SHUTDOWN_MASK;
>         sk->sk_state = SCTP_SS_CLOSING;


Hi  Cong,

I've applied the patch on bots. But so far it happened only once, so I
am not sure I will be able to give any conclusion.