Message-ID: <e0c2b581-87a4-48d2-bede-1eaab2430c7d@linux.alibaba.com>
Date: Tue, 10 Sep 2024 14:58:44 +0800
From: "D. Wythe" <alibuda@...ux.alibaba.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Wenjia Zhang <wenjia@...ux.ibm.com>,
 syzbot <syzbot+51cf7cc5f9ffc1006ef2@...kaller.appspotmail.com>,
 Dust Li <dust.li@...ux.alibaba.com>, davem@...emloft.net, kuba@...nel.org,
 linux-kernel@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com,
 syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] possible deadlock in rtnl_lock (8)



On 9/10/24 2:36 PM, Eric Dumazet wrote:
> On Tue, Sep 10, 2024 at 7:55 AM D. Wythe <alibuda@...ux.alibaba.com> wrote:
>>
>>
>> On 9/9/24 7:44 PM, Wenjia Zhang wrote:
>>>
>>> On 09.09.24 10:02, Eric Dumazet wrote:
>>>> On Sun, Sep 8, 2024 at 10:12 AM syzbot
>>>> <syzbot+51cf7cc5f9ffc1006ef2@...kaller.appspotmail.com> wrote:
>>>>> syzbot has found a reproducer for the following issue on:
>>>>>
>>>>> HEAD commit:    df54f4a16f82 Merge branch 'for-next/core' into
>>>>> for-kernelci
>>>>> git tree:
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
>>>>> for-kernelci
>>>>> console output:
>>>>> https://syzkaller.appspot.com/x/log.txt?x=12bdabc7980000
>>>>> kernel config:
>>>>> https://syzkaller.appspot.com/x/.config?x=dde5a5ba8d41ee9e
>>>>> dashboard link:
>>>>> https://syzkaller.appspot.com/bug?extid=51cf7cc5f9ffc1006ef2
>>>>> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils
>>>>> for Debian) 2.40
>>>>> userspace arch: arm64
>>>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1798589f980000
>>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10a30e00580000
>>>>>
>>>>> Downloadable assets:
>>>>> disk image:
>>>>> https://storage.googleapis.com/syzbot-assets/aa2eb06e0aea/disk-df54f4a1.raw.xz
>>>>> vmlinux:
>>>>> https://storage.googleapis.com/syzbot-assets/14728733d385/vmlinux-df54f4a1.xz
>>>>> kernel image:
>>>>> https://storage.googleapis.com/syzbot-assets/99816271407d/Image-df54f4a1.gz.xz
>>>>>
>>>>> IMPORTANT: if you fix the issue, please add the following tag to the
>>>>> commit:
>>>>> Reported-by: syzbot+51cf7cc5f9ffc1006ef2@...kaller.appspotmail.com
>>>>>
>>>>> ======================================================
>>>>> WARNING: possible circular locking dependency detected
>>>>> 6.11.0-rc5-syzkaller-gdf54f4a16f82 #0 Not tainted
>>>>> ------------------------------------------------------
>>>>> syz-executor272/6388 is trying to acquire lock:
>>>>> ffff8000923b6ce8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x20/0x2c
>>>>> net/core/rtnetlink.c:79
>>>>>
>>>>> but task is already holding lock:
>>>>> ffff0000dc408a50 (&smc->clcsock_release_lock){+.+.}-{3:3}, at:
>>>>> smc_setsockopt+0x178/0x10fc net/smc/af_smc.c:3064
>>>>>
>>>>> which lock already depends on the new lock.
>>>>>
>> I have been aware of this issue for a while, but I doubt that it can
>> actually happen. If I understand correctly, the following deadlock is
>> reported here:
>>
>> #2
>> lock_sock_smc
>> {
>>       clcsock_release_lock            --- deadlock
>>       {
>>
>>       }
>> }
>>
>> #1
>> rtnl_mutex
>> {
>>       lock_sock_smc
>>       {
>>
>>       }
>> }
>>
>> #0
>> clcsock_release_lock
>> {
>>       rtnl_mutex                      --deadlock
>>       {
>>
>>       }
>> }
>>
>> This is of course a deadlock, but #1 is suspicious.
>>
>> How could this happen with an SMC sock?
>>
>> #1 ->
>>          lock_sock_nested+0x38/0xe8 net/core/sock.c:3543
>>          lock_sock include/net/sock.h:1607 [inline]
>>          sockopt_lock_sock net/core/sock.c:1061 [inline]
>>          sockopt_lock_sock+0x58/0x74 net/core/sock.c:1052
>>          do_ip_setsockopt+0xe0/0x2358 net/ipv4/ip_sockglue.c:1078
>>          ip_setsockopt+0x34/0x9c net/ipv4/ip_sockglue.c:1417
>>          raw_setsockopt+0x7c/0x2e0 net/ipv4/raw.c:845
>>          sock_common_setsockopt+0x70/0xe0 net/core/sock.c:3735
>>          do_sock_setsockopt+0x17c/0x354 net/socket.c:2324
>>
>> For comparison, the expected call chain for an SMC sock is:
>>
>>          sock_common_setsockopt+0x70/0xe0 net/core/sock.c:3735
>>          smc_setsockopt+0x150/0xcec net/smc/af_smc.c:3072
>>          do_sock_setsockopt+0x17c/0x354 net/socket.c:2324
>>
>>
>> That is to say, setting any SOL_IP option on an smc_sock goes through
>> smc_setsockopt, which tries to take clcsock_release_lock first.
>>
>> Anyway, if anyone can explain #1, we can work out how to solve this
>> problem. Otherwise I think the problem doesn't exist. (Just my opinion)
> Then SMC lacks some lockdep annotations.
>
> Please take a look at sock_lock_init_class_and_name() callers.

It seems so, which also explains why it wasn't reported with an AF_SMC sock.
I'll try to fix it ASAP.

D. Wythe
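
The point about missing lockdep annotations can be illustrated with a small userspace model. This is only a sketch, not kernel code and not the eventual fix: it assumes that lockdep tracks dependencies per lock class rather than per lock instance, and that the SMC socket's sk_lock is currently tracked under the same class as an ordinary AF_INET socket. The names `SK_LOCK_SMC`, `lock_graph` and `report_has_cycle` are hypothetical and exist only for this example. It records the three chains #0, #1 and #2 from the report as edges in a class graph and checks for a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* Toy lock-class graph: edge[a][b] means "class b was acquired while a was held". */
struct lock_graph {
    bool edge[MAX_CLASSES][MAX_CLASSES];
};

enum {
    RTNL_MUTEX,
    CLCSOCK_RELEASE_LOCK,
    SK_LOCK_AF_INET, /* class shared by ordinary AF_INET sockets */
    SK_LOCK_SMC,     /* hypothetical dedicated class for SMC sockets */
};

static void graph_init(struct lock_graph *g)
{
    memset(g, 0, sizeof(*g));
}

static void record_dependency(struct lock_graph *g, int held, int acquired)
{
    assert(held >= 0 && held < MAX_CLASSES);
    assert(acquired >= 0 && acquired < MAX_CLASSES);
    g->edge[held][acquired] = true;
}

/* Depth-first search: can 'target' be reached from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int target,
                      bool *visited)
{
    if (from == target)
        return true;
    visited[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++) {
        if (g->edge[from][next] && !visited[next] &&
            reachable(g, next, target, visited))
            return true;
    }
    return false;
}

/* A circular dependency exists if some edge a -> b has a path from b back to a. */
static bool has_cycle(const struct lock_graph *g)
{
    for (int a = 0; a < MAX_CLASSES; a++) {
        for (int b = 0; b < MAX_CLASSES; b++) {
            bool visited[MAX_CLASSES] = { false };

            if (g->edge[a][b] && reachable(g, b, a, visited))
                return true;
        }
    }
    return false;
}

/*
 * Record the three chains from the report. 'smc_sk_class' is the class
 * under which the SMC socket's sk_lock is tracked in this model.
 */
static bool report_has_cycle(int smc_sk_class)
{
    struct lock_graph g;

    graph_init(&g);
    /* #0: smc_setsockopt holds clcsock_release_lock, then takes rtnl_mutex */
    record_dependency(&g, CLCSOCK_RELEASE_LOCK, RTNL_MUTEX);
    /* #1: do_ip_setsockopt on a raw socket takes rtnl_mutex, then lock_sock */
    record_dependency(&g, RTNL_MUTEX, SK_LOCK_AF_INET);
    /* #2: the SMC socket lock is held, then clcsock_release_lock is taken */
    record_dependency(&g, smc_sk_class, CLCSOCK_RELEASE_LOCK);
    return has_cycle(&g);
}
```

In this model, `report_has_cycle(SK_LOCK_AF_INET)` returns true because chain #1, recorded on a raw socket, closes the loop through the shared class, while `report_has_cycle(SK_LOCK_SMC)` returns false because the raw-socket chain no longer touches the SMC socket's class. Under the stated assumptions, that matches the observation that #1 never runs on an SMC sock yet still appears in the reported cycle.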

