Message-ID: <CAH3MdRXb15zcioWt77nhaX18woD_8sqOMr+tpbt2z7t4xR-jJA@mail.gmail.com>
Date:   Sun, 8 Apr 2018 08:56:01 -0700
From:   Y Song <ys114321@...il.com>
To:     Daniel Borkmann <daniel@...earbox.net>
Cc:     syzbot <syzbot+dc5ca0e4c9bfafaf2bae@...kaller.appspotmail.com>,
        Alexei Starovoitov <ast@...nel.org>,
        linux-kernel@...r.kernel.org, mingo@...hat.com,
        netdev <netdev@...r.kernel.org>, rostedt@...dmis.org,
        syzkaller-bugs@...glegroups.com
Subject: Re: possible deadlock in perf_event_detach_bpf_prog

On Thu, Mar 29, 2018 at 2:18 PM, Daniel Borkmann <daniel@...earbox.net> wrote:
> On 03/29/2018 11:04 PM, syzbot wrote:
>> Hello,
>>
>> syzbot hit the following crash on upstream commit
>> 3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +0000)
>> Linux 4.16-rc7
>> syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=dc5ca0e4c9bfafaf2bae
>>
>> Unfortunately, I don't have any reproducer for this crash yet.
>> Raw console output: https://syzkaller.appspot.com/x/log.txt?id=4742532743299072
>> Kernel config: https://syzkaller.appspot.com/x/.config?id=-8440362230543204781
>> compiler: gcc (GCC) 7.1.1 20170620
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+dc5ca0e4c9bfafaf2bae@...kaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for details.
>> If you forward the report, please keep this part and the footer.
>>
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 4.16.0-rc7+ #3 Not tainted
>> ------------------------------------------------------
>> syz-executor7/24531 is trying to acquire lock:
>>  (bpf_event_mutex){+.+.}, at: [<000000008a849b07>] perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>>
>> but task is already holding lock:
>>  (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353
>>
>> which lock already depends on the new lock.
>>
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (&mm->mmap_sem){++++}:
>>        __might_fault+0x13a/0x1d0 mm/memory.c:4571
>>        _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
>>        copy_to_user include/linux/uaccess.h:155 [inline]
>>        bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694
>>        perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891
>
> Looks like we should move the two copy_to_user() outside of
> bpf_event_mutex section to avoid the deadlock.

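This was introduced by one of my previous patches. The fix suggested above
makes sense: perf_event_query_prog_array() calls copy_to_user() while holding
bpf_event_mutex, and copy_to_user() may fault and take mmap_sem. I will craft
a patch and send it to the mailing list for the bpf branch soon.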
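
To illustrate the shape of the suggested fix, here is a minimal userspace
sketch, not the actual kernel patch. It assumes a pthread mutex standing in
for bpf_event_mutex and a plain memcpy() standing in for copy_to_user(); the
names event_mutex, prog_ids, fake_copy_to_user() and query_prog_ids() are
hypothetical. The idea is to snapshot the data into a private buffer under the
mutex, drop the mutex, and only then copy out to the caller, so the step that
may take mmap_sem never runs with the mutex held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_IDS 8

/* Hypothetical stand-in for bpf_event_mutex. */
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the program array attached to an event. */
static uint32_t prog_ids[MAX_IDS] = { 11, 22, 33 };
static size_t prog_cnt = 3;

/*
 * Hypothetical stand-in for copy_to_user(). In the kernel this call may
 * fault and acquire mmap_sem, which is why it must not run under the mutex.
 * Returns 0 on success.
 */
static int fake_copy_to_user(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
	return 0;
}

/*
 * Copy up to user_cap program ids into user_ids.
 * Returns the number of ids copied, or -1 on failure.
 */
int query_prog_ids(uint32_t *user_ids, size_t user_cap)
{
	uint32_t *snapshot;
	size_t n;
	int err;

	assert(user_ids != NULL);
	if (user_cap == 0)
		return 0;

	/* Allocate the private buffer before taking the mutex. */
	snapshot = calloc(user_cap, sizeof(*snapshot));
	if (!snapshot)
		return -1;

	/* Critical section: only touch data protected by the mutex. */
	pthread_mutex_lock(&event_mutex);
	n = prog_cnt < user_cap ? prog_cnt : user_cap;
	memcpy(snapshot, prog_ids, n * sizeof(*snapshot));
	pthread_mutex_unlock(&event_mutex);

	/* The copy-out happens with the mutex already released. */
	err = fake_copy_to_user(user_ids, snapshot, n * sizeof(*snapshot));
	free(snapshot);

	return err ? -1 : (int)n;
}
```

With this ordering the mutex is never held while the second lock could be
taken, so the mutex -> mmap_sem dependency from chain #1 in the report would
no longer be recorded. The cost is one temporary allocation per query.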

>
>>        _perf_ioctl kernel/events/core.c:4750 [inline]
>>        perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770
>>        vfs_ioctl fs/ioctl.c:46 [inline]
>>        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
>>        SYSC_ioctl fs/ioctl.c:701 [inline]
>>        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
>>        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>>
>> -> #0 (bpf_event_mutex){+.+.}:
>>        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
>>        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>        perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>>        perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
>>        _free_event+0xbdb/0x10f0 kernel/events/core.c:4116
>>        put_event+0x24/0x30 kernel/events/core.c:4204
>>        perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
>>        remove_vma+0xb4/0x1b0 mm/mmap.c:172
>>        remove_vma_list mm/mmap.c:2490 [inline]
>>        do_munmap+0x82a/0xdf0 mm/mmap.c:2731
>>        mmap_region+0x59e/0x15a0 mm/mmap.c:1646
>>        do_mmap+0x6c0/0xe00 mm/mmap.c:1483
>>        do_mmap_pgoff include/linux/mm.h:2223 [inline]
>>        vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
>>        SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
>>        SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
>>        SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>>        SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
>>        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>>        entry_SYSCALL_64_after_hwframe+0x42/0xb7
>>
>> other info that might help us debug this:
>>
>>  Possible unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&mm->mmap_sem);
>>                                lock(bpf_event_mutex);
>>                                lock(&mm->mmap_sem);
>>   lock(bpf_event_mutex);
>>
>>  *** DEADLOCK ***
>>
>> 1 lock held by syz-executor7/24531:
>>  #0:  (&mm->mmap_sem){++++}, at: [<0000000038768f87>] vm_mmap_pgoff+0x198/0x280 mm/util.c:353
>>
>> stack backtrace:
>> CPU: 0 PID: 24531 Comm: syz-executor7 Not tainted 4.16.0-rc7+ #3
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:17 [inline]
>>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>>  print_circular_bug.isra.38+0x2cd/0x2dc kernel/locking/lockdep.c:1223
>>  check_prev_add kernel/locking/lockdep.c:1863 [inline]
>>  check_prevs_add kernel/locking/lockdep.c:1976 [inline]
>>  validate_chain kernel/locking/lockdep.c:2417 [inline]
>>  __lock_acquire+0x30a8/0x3e00 kernel/locking/lockdep.c:3431
>>  lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
>>  __mutex_lock_common kernel/locking/mutex.c:756 [inline]
>>  __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
>>  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
>>  perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854
>>  perf_event_free_bpf_prog kernel/events/core.c:8147 [inline]
>>  _free_event+0xbdb/0x10f0 kernel/events/core.c:4116
>>  put_event+0x24/0x30 kernel/events/core.c:4204
>>  perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172
>>  remove_vma+0xb4/0x1b0 mm/mmap.c:172
>>  remove_vma_list mm/mmap.c:2490 [inline]
>>  do_munmap+0x82a/0xdf0 mm/mmap.c:2731
>>  mmap_region+0x59e/0x15a0 mm/mmap.c:1646
>>  do_mmap+0x6c0/0xe00 mm/mmap.c:1483
>>  do_mmap_pgoff include/linux/mm.h:2223 [inline]
>>  vm_mmap_pgoff+0x1de/0x280 mm/util.c:355
>>  SYSC_mmap_pgoff mm/mmap.c:1533 [inline]
>>  SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491
>>  SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>>  SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91
>>  do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> RIP: 0033:0x454889
>> RSP: 002b:00007f5f44fdac68 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
>> RAX: ffffffffffffffda RBX: 00007f5f44fdb6d4 RCX: 0000000000454889
>> RDX: 0000000000000000 RSI: 0000000000002000 RDI: 0000000020f1f000
>> RBP: 000000000072c010 R08: 0000000000000014 R09: 0000000000000000
>> R10: 0000000000000011 R11: 0000000000000246 R12: 00000000ffffffff
>> R13: 00000000000003f4 R14: 00000000006f7f80 R15: 0000000000000002
>> bond0 (unregistering): Released all slaves
>> IPVS: ftp: loaded support on port[0] = 21
>> IPv6: ADDRCONF(NETDEV_UP): bridge0: link is not ready
>> IPv6: ADDRCONF(NETDEV_UP): bond0: link is not ready
>> 8021q: adding VLAN 0 to HW filter on device bond0
>> IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
>> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>> kernel msg: ebtables bug: please report to author: Wrong len argument
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkaller@...glegroups.com.
>>
>> syzbot will keep track of this bug report.
>> If you forgot to add the Reported-by tag, once the fix for this bug is merged
>> into any tree, please reply to this email with:
>> #syz fix: exact-commit-title
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug report.
>> Note: all commands must start from beginning of the line in the email body.
>