linux-kernel - Re: possible deadlock in proc_pid

Message-ID: <87mu2fj7xu.fsf@x220.int.ebiederm.org>
Date:   Fri, 28 Aug 2020 07:01:17 -0500
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     syzbot <syzbot+db9cdf3dd1f64252c6ef@...kaller.appspotmail.com>
Cc:     adobriyan@...il.com, akpm@...ux-foundation.org, avagin@...il.com,
        christian@...uner.io, gladkov.alexey@...il.com,
        keescook@...omium.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com,
        walken@...gle.com, Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Miklos Szeredi <miklos@...redi.hu>
Subject: Re: possible deadlock in proc_pid_syscall (2)

syzbot <syzbot+db9cdf3dd1f64252c6ef@...kaller.appspotmail.com> writes:

> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:    15bc20c6 Merge tag 'tty-5.9-rc3' of git://git.kernel.org/p..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15349f96900000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=978db74cb30aa994
> dashboard link: https://syzkaller.appspot.com/bug?extid=db9cdf3dd1f64252c6ef
> compiler:       gcc (GCC) 10.1.0-syz 20200507
>
> Unfortunately, I don't have any reproducer for this issue yet.

Ok.

So if I read this set of traces correctly:

- perf_event_open holds exec_update_mutex

- perf_event_open can call kern_path which for overlayfs takes ovl_i_mutex

- chown on overlayfs calls mnt_want_write which takes sb_writers

- sendfile/splice can read from a seq_file (which takes p->lock)
  while holding mnt_want_write, aka sb_writers, of the target file.

- There are proc files that are seq_files that first take p->lock
  then take exec_update_mutex

So this looks real, if painful.
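
As a rough cross-check of the chain above, here is a minimal sketch (not
kernel code) that records each step as a "held -> acquired" edge and looks
for a cycle. The edge list is my reading of the bullets and the lockdep
chain; the function name find_cycle and the shortened lock names are
illustrative only.

```python
from collections import defaultdict

# Each edge (A, B) means: B was acquired while A was held.
# This is one reading of the report, not something extracted from the kernel.
EDGES = [
    ("exec_update_mutex", "ovl_i_mutex"),  # perf_event_open -> kern_path on overlayfs
    ("ovl_i_mutex", "sb_writers"),         # chown on overlayfs -> mnt_want_write
    ("sb_writers", "p->lock"),             # sendfile/splice reading a seq_file
    ("p->lock", "exec_update_mutex"),      # proc seq_file read -> lock_trace
]


def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if the graph is acyclic."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    def visit(node, path, on_path):
        if node in on_path:
            # Slice the current path from the first occurrence of node.
            return path[path.index(node):] + [node]
        on_path.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path, on_path)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    for start in list(graph):
        cycle = visit(start, [], set())
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(EDGES)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Dropping any one edge from EDGES makes find_cycle return None, which is
why breaking a single link in the chain is enough.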

I have added some likely looking overlayfs and perf people to look at
this.

This feels like a case where perf simply does too much under
exec_update_mutex, in particular calling kern_path from
create_local_trace_uprobe.  Calling into the VFS, at the very least,
makes it impossible to know exactly which locks will be taken.
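
To illustrate the concern, here is a small userspace model (an assumption
on my part, not a proposed patch) that records which locks are taken while
another is held. LockOrderRecorder, TrackedLock, lookup_path and the two
open_event_* functions are hypothetical stand-ins; whether the path lookup
can actually be moved outside exec_update_mutex in perf is not something
this sketch establishes.

```python
import threading


class LockOrderRecorder:
    """Records (held, acquired) pairs, loosely like the edges lockdep tracks."""

    def __init__(self):
        self.held = []      # names currently held (single-threaded demo)
        self.edges = set()  # observed "held -> acquired" pairs

    def lock(self, name):
        return TrackedLock(self, name)


class TrackedLock:
    def __init__(self, recorder, name):
        self.recorder = recorder
        self.name = name
        self._lock = threading.Lock()

    def __enter__(self):
        # Note every lock already held before taking this one.
        for held in self.recorder.held:
            self.recorder.edges.add((held, self.name))
        self._lock.acquire()
        self.recorder.held.append(self.name)
        return self

    def __exit__(self, exc_type, exc, tb):
        self.recorder.held.remove(self.name)
        self._lock.release()
        return False


def lookup_path(inode_lock):
    # Stand-in for a VFS path lookup that takes a filesystem lock.
    with inode_lock:
        return "resolved-path"


def open_event_nested(rec):
    # Path lookup happens while the outer mutex is held.
    exec_update = rec.lock("exec_update_mutex")
    inode_lock = rec.lock("ovl_i_mutex")
    with exec_update:
        return lookup_path(inode_lock)


def open_event_hoisted(rec):
    # Path lookup finishes before the outer mutex is taken.
    exec_update = rec.lock("exec_update_mutex")
    inode_lock = rec.lock("ovl_i_mutex")
    path = lookup_path(inode_lock)
    with exec_update:
        return path


nested = LockOrderRecorder()
open_event_nested(nested)
hoisted = LockOrderRecorder()
open_event_hoisted(hoisted)

if __name__ == "__main__":
    print("nested: ", sorted(nested.edges))
    print("hoisted:", sorted(hoisted.edges))
```

In the nested variant the recorder sees an exec_update_mutex -> ovl_i_mutex
edge; in the hoisted variant no ordering edge is recorded at all, so
whatever the filesystem locks internally cannot end up ordered behind
exec_update_mutex.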

Thoughts?

Eric

> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+db9cdf3dd1f64252c6ef@...kaller.appspotmail.com
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.9.0-rc2-syzkaller #0 Not tainted
> ------------------------------------------------------
> syz-executor.0/18445 is trying to acquire lock:
> ffff88809f2e0dc8 (&sig->exec_update_mutex){+.+.}-{3:3}, at: lock_trace fs/proc/base.c:408 [inline]
> ffff88809f2e0dc8 (&sig->exec_update_mutex){+.+.}-{3:3}, at: proc_pid_syscall+0xaa/0x2b0 fs/proc/base.c:646
>
> but task is already holding lock:
> ffff88808e9a3c30 (&p->lock){+.+.}-{3:3}, at: seq_read+0x61/0x1070 fs/seq_file.c:155
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&p->lock){+.+.}-{3:3}:
>        __mutex_lock_common kernel/locking/mutex.c:956 [inline]
>        __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
>        seq_read+0x61/0x1070 fs/seq_file.c:155
>        pde_read fs/proc/inode.c:306 [inline]
>        proc_reg_read+0x221/0x300 fs/proc/inode.c:318
>        do_loop_readv_writev fs/read_write.c:734 [inline]
>        do_loop_readv_writev fs/read_write.c:721 [inline]
>        do_iter_read+0x48e/0x6e0 fs/read_write.c:955
>        vfs_readv+0xe5/0x150 fs/read_write.c:1073
>        kernel_readv fs/splice.c:355 [inline]
>        default_file_splice_read.constprop.0+0x4e6/0x9e0 fs/splice.c:412
>        do_splice_to+0x137/0x170 fs/splice.c:871
>        splice_direct_to_actor+0x307/0x980 fs/splice.c:950
>        do_splice_direct+0x1b3/0x280 fs/splice.c:1059
>        do_sendfile+0x55f/0xd40 fs/read_write.c:1540
>        __do_sys_sendfile64 fs/read_write.c:1601 [inline]
>        __se_sys_sendfile64 fs/read_write.c:1587 [inline]
>        __x64_sys_sendfile64+0x1cc/0x210 fs/read_write.c:1587
>        do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> -> #2 (sb_writers#4){.+.+}-{0:0}:
>        percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
>        __sb_start_write+0x234/0x470 fs/super.c:1672
>        sb_start_write include/linux/fs.h:1643 [inline]
>        mnt_want_write+0x3a/0xb0 fs/namespace.c:354
>        ovl_setattr+0x5c/0x850 fs/overlayfs/inode.c:28
>        notify_change+0xb60/0x10a0 fs/attr.c:336
>        chown_common+0x4a9/0x550 fs/open.c:674
>        do_fchownat+0x126/0x1e0 fs/open.c:704
>        __do_sys_lchown fs/open.c:729 [inline]
>        __se_sys_lchown fs/open.c:727 [inline]
>        __x64_sys_lchown+0x7a/0xc0 fs/open.c:727
>        do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> -> #1 (&ovl_i_mutex_dir_key[depth]){++++}-{3:3}:
>        down_read+0x96/0x420 kernel/locking/rwsem.c:1492
>        inode_lock_shared include/linux/fs.h:789 [inline]
>        lookup_slow fs/namei.c:1560 [inline]
>        walk_component+0x409/0x6a0 fs/namei.c:1860
>        lookup_last fs/namei.c:2309 [inline]
>        path_lookupat+0x1ba/0x830 fs/namei.c:2333
>        filename_lookup+0x19f/0x560 fs/namei.c:2366
>        create_local_trace_uprobe+0x87/0x4e0 kernel/trace/trace_uprobe.c:1574
>        perf_uprobe_init+0x132/0x210 kernel/trace/trace_event_perf.c:323
>        perf_uprobe_event_init+0xff/0x1c0 kernel/events/core.c:9580
>        perf_try_init_event+0x12a/0x560 kernel/events/core.c:10899
>        perf_init_event kernel/events/core.c:10951 [inline]
>        perf_event_alloc.part.0+0xdee/0x3770 kernel/events/core.c:11229
>        perf_event_alloc kernel/events/core.c:11608 [inline]
>        __do_sys_perf_event_open+0x72c/0x2cb0 kernel/events/core.c:11724
>        do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> -> #0 (&sig->exec_update_mutex){+.+.}-{3:3}:
>        check_prev_add kernel/locking/lockdep.c:2496 [inline]
>        check_prevs_add kernel/locking/lockdep.c:2601 [inline]
>        validate_chain kernel/locking/lockdep.c:3218 [inline]
>        __lock_acquire+0x2a6b/0x5640 kernel/locking/lockdep.c:4426
>        lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:5005
>        __mutex_lock_common kernel/locking/mutex.c:956 [inline]
>        __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
>        lock_trace fs/proc/base.c:408 [inline]
>        proc_pid_syscall+0xaa/0x2b0 fs/proc/base.c:646
>        proc_single_show+0x116/0x1e0 fs/proc/base.c:775
>        seq_read+0x432/0x1070 fs/seq_file.c:208
>        do_loop_readv_writev fs/read_write.c:734 [inline]
>        do_loop_readv_writev fs/read_write.c:721 [inline]
>        do_iter_read+0x48e/0x6e0 fs/read_write.c:955
>        vfs_readv+0xe5/0x150 fs/read_write.c:1073
>        do_preadv fs/read_write.c:1165 [inline]
>        __do_sys_preadv fs/read_write.c:1215 [inline]
>        __se_sys_preadv fs/read_write.c:1210 [inline]
>        __x64_sys_preadv+0x231/0x310 fs/read_write.c:1210
>        do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>        entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> other info that might help us debug this:
>
> Chain exists of:
>   &sig->exec_update_mutex --> sb_writers#4 --> &p->lock
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(&p->lock);
>                                lock(sb_writers#4);
>                                lock(&p->lock);
>   lock(&sig->exec_update_mutex);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor.0/18445:
>  #0: ffff88808e9a3c30 (&p->lock){+.+.}-{3:3}, at: seq_read+0x61/0x1070 fs/seq_file.c:155
>
> stack backtrace:
> CPU: 0 PID: 18445 Comm: syz-executor.0 Not tainted 5.9.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x18f/0x20d lib/dump_stack.c:118
>  check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
>  check_prev_add kernel/locking/lockdep.c:2496 [inline]
>  check_prevs_add kernel/locking/lockdep.c:2601 [inline]
>  validate_chain kernel/locking/lockdep.c:3218 [inline]
>  __lock_acquire+0x2a6b/0x5640 kernel/locking/lockdep.c:4426
>  lock_acquire+0x1f1/0xad0 kernel/locking/lockdep.c:5005
>  __mutex_lock_common kernel/locking/mutex.c:956 [inline]
>  __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
>  lock_trace fs/proc/base.c:408 [inline]
>  proc_pid_syscall+0xaa/0x2b0 fs/proc/base.c:646
>  proc_single_show+0x116/0x1e0 fs/proc/base.c:775
>  seq_read+0x432/0x1070 fs/seq_file.c:208
>  do_loop_readv_writev fs/read_write.c:734 [inline]
>  do_loop_readv_writev fs/read_write.c:721 [inline]
>  do_iter_read+0x48e/0x6e0 fs/read_write.c:955
>  vfs_readv+0xe5/0x150 fs/read_write.c:1073
>  do_preadv fs/read_write.c:1165 [inline]
>  __do_sys_preadv fs/read_write.c:1215 [inline]
>  __se_sys_preadv fs/read_write.c:1210 [inline]
>  __x64_sys_preadv+0x231/0x310 fs/read_write.c:1210
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x45d5b9
> Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007fb613f9ec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000127
> RAX: ffffffffffffffda RBX: 0000000000025740 RCX: 000000000045d5b9
> RDX: 0000000000000333 RSI: 00000000200017c0 RDI: 0000000000000006
> RBP: 000000000118cf90 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 000000000118cf4c
> R13: 00007ffe2a82bbbf R14: 00007fb613f9f9c0 R15: 000000000118cf4c
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@...glegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.