linux-kernel - Re: [syzbot] [autofs?] possible deadlock in autofs_notify

Message-ID: <19ab74d8-06c7-400d-9bbe-2c25a068bb44@themaw.net>
Date: Wed, 24 Jul 2024 08:47:50 +0800
From: Ian Kent <raven@...maw.net>
To: syzbot <syzbot+0d4e0978aa13f9e1db55@...kaller.appspotmail.com>,
 autofs@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [autofs?] possible deadlock in autofs_notify_daemon

On 22/7/24 07:57, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:

I'm having trouble understanding this; you'll need to do better at the
explanation.


>
> HEAD commit:    d7e78951a8b8 Merge tag 'net-6.11-rc0' of git://git.kernel...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1642f7a5980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2602dfd9213d734c
> dashboard link: https://syzkaller.appspot.com/bug?extid=0d4e0978aa13f9e1db55
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>
> Unfortunately, I don't have any reproducer for this issue yet.

A reproducer might help, since which file system is mounted makes a
difference.


Note that this notification is done to a specific user space process,
and there is only one for a given autofs file system mount, and all
other processes are read-only within that autofs file system. So I
don't see how another process writing to a kernfs file can play a part
in this.


>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/cdd2c14644df/disk-d7e78951.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/7f9c9ab39b87/vmlinux-d7e78951.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1fc3658770e2/bzImage-d7e78951.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+0d4e0978aa13f9e1db55@...kaller.appspotmail.com
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.10.0-syzkaller-09703-gd7e78951a8b8 #0 Not tainted
> ------------------------------------------------------
> syz.3.4748/19551 is trying to acquire lock:
> ffff888059b0d940 (&sbi->pipe_mutex){+.+.}-{3:3}, at: autofs_write fs/autofs/waitq.c:55 [inline]
> ffff888059b0d940 (&sbi->pipe_mutex){+.+.}-{3:3}, at: autofs_notify_daemon+0x71f/0xf80 fs/autofs/waitq.c:164
>
> but task is already holding lock:
> ffff8880758e7888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1eb/0x500 fs/kernfs/file.c:325
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&of->mutex){+.+.}-{3:3}:
>         lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
>         __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>         __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>         kernfs_fop_write_iter+0x1eb/0x500 fs/kernfs/file.c:325
>         iter_file_splice_write+0xbd7/0x14e0 fs/splice.c:743
>         do_splice_from fs/splice.c:941 [inline]
>         do_splice+0xd77/0x1900 fs/splice.c:1354
>         __do_splice fs/splice.c:1436 [inline]
>         __do_sys_splice fs/splice.c:1652 [inline]
>         __se_sys_splice+0x331/0x4a0 fs/splice.c:1634
>         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>         do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>         entry_SYSCALL_64_after_hwframe+0x77/0x7f

Is it really possible for some process to try to take a lock that
conflicts with a pipe owned by a process that doesn't make calls into
kernfs and will not open a file in kernfs? This pipe is opened against
an autofs file system.

I don't understand the scenario; you'll need to help me out with some
explanation of how such an interaction can happen!


Ian

>
> -> #1 (&pipe->mutex){+.+.}-{3:3}:
>         lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
>         __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>         __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>         pipe_write+0x1c9/0x1a40 fs/pipe.c:455
>         __kernel_write_iter+0x47e/0x900 fs/read_write.c:523
>         __kernel_write+0x120/0x180 fs/read_write.c:543
>         autofs_write fs/autofs/waitq.c:57 [inline]
>         autofs_notify_daemon+0x732/0xf80 fs/autofs/waitq.c:164
>         autofs_wait+0x10b8/0x1b30 fs/autofs/waitq.c:426
>         autofs_do_expire_multi+0x659/0x950 fs/autofs/expire.c:590
>         autofs_root_ioctl+0x4c/0x60 fs/autofs/root.c:910
>         vfs_ioctl fs/ioctl.c:51 [inline]
>         __do_sys_ioctl fs/ioctl.c:907 [inline]
>         __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
>         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>         do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>         entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> -> #0 (&sbi->pipe_mutex){+.+.}-{3:3}:
>         check_prev_add kernel/locking/lockdep.c:3133 [inline]
>         check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>         validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3868
>         __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
>         lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
>         __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>         __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>         autofs_write fs/autofs/waitq.c:55 [inline]
>         autofs_notify_daemon+0x71f/0xf80 fs/autofs/waitq.c:164
>         autofs_wait+0x10b8/0x1b30 fs/autofs/waitq.c:426
>         autofs_mount_wait+0x170/0x330 fs/autofs/root.c:255
>         autofs_d_automount+0x555/0x710 fs/autofs/root.c:401
>         follow_automount fs/namei.c:1394 [inline]
>         __traverse_mounts+0x2ba/0x580 fs/namei.c:1439
>         traverse_mounts fs/namei.c:1468 [inline]
>         handle_mounts fs/namei.c:1571 [inline]
>         step_into+0x5e5/0x1080 fs/namei.c:1877
>         lookup_last fs/namei.c:2542 [inline]
>         path_lookupat+0x16f/0x450 fs/namei.c:2566
>         filename_lookup+0x256/0x610 fs/namei.c:2595
>         kern_path+0x35/0x50 fs/namei.c:2703
>         lookup_bdev+0xc5/0x290 block/bdev.c:1157
>         resume_store+0x1a0/0x710 kernel/power/hibernate.c:1235
>         kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
>         iter_file_splice_write+0xbd7/0x14e0 fs/splice.c:743
>         do_splice_from fs/splice.c:941 [inline]
>         direct_splice_actor+0x11e/0x220 fs/splice.c:1164
>         splice_direct_to_actor+0x58e/0xc90 fs/splice.c:1108
>         do_splice_direct_actor fs/splice.c:1207 [inline]
>         do_splice_direct+0x28c/0x3e0 fs/splice.c:1233
>         do_sendfile+0x56d/0xe20 fs/read_write.c:1295
>         __do_sys_sendfile64 fs/read_write.c:1362 [inline]
>         __se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1348
>         do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>         do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>         entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> other info that might help us debug this:
>
> Chain exists of:
>    &sbi->pipe_mutex --> &pipe->mutex --> &of->mutex
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&of->mutex);
>                                 lock(&pipe->mutex);
>                                 lock(&of->mutex);
>    lock(&sbi->pipe_mutex);
>
>   *** DEADLOCK ***
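
The chain and scenario above can be modelled as a small directed graph of lock classes. The sketch below is only a toy illustration, not how lockdep is implemented: the three edges are an assumed reading of the #1, #2 and #0 stacks in this report, and `EDGES` and `find_cycle` are hypothetical names. Note that lockdep records ordering between lock classes rather than individual lock instances, so the pipe in the #2 stack need not be the pipe owned by the autofs daemon for the cycle to be reported.

```python
from collections import defaultdict

# Lock-class ordering edges, as read (an assumption) from the three stacks.
# Each pair means "the second class was acquired while the first was held".
EDGES = [
    ("&sbi->pipe_mutex", "&pipe->mutex"),  # -> #1: autofs_write() calls pipe_write()
    ("&pipe->mutex", "&of->mutex"),        # -> #2: splice into a kernfs file
    ("&of->mutex", "&sbi->pipe_mutex"),    # -> #0: resume_store() lookup hits automount
]


def find_cycle(edges):
    """Return one cycle as a list of lock classes, or None if none exists."""
    graph = defaultdict(list)
    for held, wanted in edges:
        graph[held].append(wanted)

    def visit(node, path):
        # Reaching a class already on the current path closes a cycle.
        if node in path:
            return path[path.index(node):] + [node]
        for nxt in graph.get(node, []):
            found = visit(nxt, path + [node])
            if found:
                return found
        return None

    for start in list(graph):
        found = visit(start, [])
        if found:
            return found
    return None


if __name__ == "__main__":
    cycle = find_cycle(EDGES)
    print(" --> ".join(cycle) if cycle else "no cycle")
```

Dropping the last edge (the `resume_store()` path from the #0 stack) leaves the ordering acyclic in this model, which is why the warning only fires once a write to the kernfs file triggers an automount.
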
>
> 3 locks held by syz.3.4748/19551:
>   #0: ffff88801e524420 (sb_writers#8){.+.+}-{0:0}, at: direct_splice_actor+0x49/0x220 fs/splice.c:1163
>   #1: ffff8880758e7888 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1eb/0x500 fs/kernfs/file.c:325
>   #2: ffff888017adb4b8 (kn->active#65){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20f/0x500 fs/kernfs/file.c:326
>
> stack backtrace:
> CPU: 1 PID: 19551 Comm: syz.3.4748 Not tainted 6.10.0-syzkaller-09703-gd7e78951a8b8 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
> Call Trace:
>   <TASK>
>   __dump_stack lib/dump_stack.c:88 [inline]
>   dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
>   check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2186
>   check_prev_add kernel/locking/lockdep.c:3133 [inline]
>   check_prevs_add kernel/locking/lockdep.c:3252 [inline]
>   validate_chain+0x18e0/0x5900 kernel/locking/lockdep.c:3868
>   __lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
>   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
>   __mutex_lock_common kernel/locking/mutex.c:608 [inline]
>   __mutex_lock+0x136/0xd70 kernel/locking/mutex.c:752
>   autofs_write fs/autofs/waitq.c:55 [inline]
>   autofs_notify_daemon+0x71f/0xf80 fs/autofs/waitq.c:164
>   autofs_wait+0x10b8/0x1b30 fs/autofs/waitq.c:426
>   autofs_mount_wait+0x170/0x330 fs/autofs/root.c:255
>   autofs_d_automount+0x555/0x710 fs/autofs/root.c:401
>   follow_automount fs/namei.c:1394 [inline]
>   __traverse_mounts+0x2ba/0x580 fs/namei.c:1439
>   traverse_mounts fs/namei.c:1468 [inline]
>   handle_mounts fs/namei.c:1571 [inline]
>   step_into+0x5e5/0x1080 fs/namei.c:1877
>   lookup_last fs/namei.c:2542 [inline]
>   path_lookupat+0x16f/0x450 fs/namei.c:2566
>   filename_lookup+0x256/0x610 fs/namei.c:2595
>   kern_path+0x35/0x50 fs/namei.c:2703
>   lookup_bdev+0xc5/0x290 block/bdev.c:1157
>   resume_store+0x1a0/0x710 kernel/power/hibernate.c:1235
>   kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
>   iter_file_splice_write+0xbd7/0x14e0 fs/splice.c:743
>   do_splice_from fs/splice.c:941 [inline]
>   direct_splice_actor+0x11e/0x220 fs/splice.c:1164
>   splice_direct_to_actor+0x58e/0xc90 fs/splice.c:1108
>   do_splice_direct_actor fs/splice.c:1207 [inline]
>   do_splice_direct+0x28c/0x3e0 fs/splice.c:1233
>   do_sendfile+0x56d/0xe20 fs/read_write.c:1295
>   __do_sys_sendfile64 fs/read_write.c:1362 [inline]
>   __se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1348
>   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f75ec575b59
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f75ebfde048 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
> RAX: ffffffffffffffda RBX: 00007f75ec706038 RCX: 00007f75ec575b59
> RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000004
> RBP: 00007f75ec5e4e5d R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> R13: 000000000000006e R14: 00007f75ec706038 R15: 00007ffef2866ed8
>   </TASK>
> PM: Image not found (code -6)
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@...glegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
>
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
>
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
>
> If you want to undo deduplication, reply with:
> #syz undup