linux-kernel - Re: [syzbot] [bcachefs?] possible deadlock in trans_set

Message-ID: <CANp29Y4aEEHP79xwq0TXmtdXU-pYMRJ0ikOamxjSg0rv_EkZ-g@mail.gmail.com>
Date: Mon, 2 Dec 2024 15:01:42 +0100
From: Aleksandr Nogikh <nogikh@...gle.com>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: syzbot <syzbot+78f4eb354f5ca6c1e6eb@...kaller.appspotmail.com>, 
	linux-bcachefs@...r.kernel.org, linux-kernel@...r.kernel.org, 
	syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [bcachefs?] possible deadlock in trans_set_locked

On Fri, Nov 29, 2024 at 11:57 PM Kent Overstreet
<kent.overstreet@...ux.dev> wrote:
>
> On Fri, Nov 29, 2024 at 11:02:16PM +0100, Aleksandr Nogikh wrote:
> > Hi Kent,
> >
> > For reopened bugs, syzbot appends (2), (3), etc. at the end of the
> > title. In this case, there are no numbers, so it has never reported
> > anything with such a title before.
> >
> > But it can well be the case that the underlying problem here is the
> > same as in some other syzbot report (you could then "#syz dup" the new
> > to the older one). If you happen to see patterns in such duplicate
> > reports, please let us know and we'll try to improve the crash report
> > parsing logic.
>
> It looks identical to this one which I closed last night
>
> https://syzkaller.appspot.com/bug?extid=e088be3c2d5c05aaac35
>
> Is that a parsing issue? The lockdep splats don't just look similar to
> me, they look identical.

Yes, that's exactly a report parsing issue. In this case it's even one
that's a bit more involved than usual, so I've filed an issue to
discuss it in more detail:
https://github.com/google/syzkaller/issues/5558

>
> I've got another one that I closed last night that it seems might be
> confusing for syzbot:
> https://syzkaller.appspot.com/bug?extid=64e6509c7f777aec3a24
>
> I fixed the patch that introduced the bug (it was only in -next), but I
> don't seem to have a way to tell syzbot not to reopen it unless it sees
> the updated patch.

That's actually the default behavior of syzbot: if you set the fix
commit title via `#syz fix` or via a `Reported-by` tag, syzbot will
first wait until the fix commit has reached all the trees that are
fuzzed and will reopen the issue with a " (2)" suffix only if the
failure occurred on some patched tree.
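
To make the rule above concrete, here is a minimal sketch of the decision
as described in this message. It is not syzbot's actual code: the function
name, the `trees` mapping and the commit titles are all hypothetical
stand-ins chosen for illustration.

```python
from typing import Dict, Optional, Set


def title_after_crash(title: str,
                      fix_title: Optional[str],
                      trees: Dict[str, Set[str]],
                      crashed_tree: str) -> str:
    """Return the title under which a new crash would be filed.

    `trees` maps each fuzzed tree name to the set of commit titles it
    contains (a hypothetical stand-in for however this is tracked).
    """
    if fix_title is None:
        # No fix recorded via "#syz fix" or a Reported-by tag:
        # the bug is simply still open under its original title.
        return title
    # Step 1: wait until the fix commit has reached every fuzzed tree.
    fix_everywhere = all(fix_title in commits for commits in trees.values())
    # Step 2: reopen only if the crash happened on a patched tree.
    if fix_everywhere and fix_title in trees[crashed_tree]:
        return f"{title} (2)"
    # The fix has not propagated yet, so the bug is not reopened.
    return title


# Hypothetical data: the fix is in one tree but not yet in the other.
trees = {"upstream": {"fix A"}, "linux-next": set()}
print(title_after_crash("possible deadlock in foo", "fix A", trees, "linux-next"))
trees["linux-next"].add("fix A")
print(title_after_crash("possible deadlock in foo", "fix A", trees, "linux-next"))
```

The first call keeps the original title because one tree still lacks the
fix; the second call returns the title with the " (2)" suffix.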

However, syzbot parsed these two bug reports differently. It identified them as:
* possible deadlock in __bch2_trans_relock
* possible deadlock in trans_set_locked

So, from its viewpoint, these are totally "different".
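
As a rough illustration of how two near-identical splats can end up with
different titles, here is a toy sketch. It assumes the title is built from
the first stack frame that is not on a skip list; that rule, the `SKIP`
set and the second stack are assumptions for illustration only, not
syzkaller's real parsing logic (the other report's trace is not quoted in
this thread).

```python
# Frames belonging to the lock validator itself (names taken from the
# report quoted in this thread); the skip-list idea is an assumption.
SKIP = {
    "check_prev_add", "check_prevs_add", "validate_chain",
    "__lock_acquire", "lock_acquire",
}


def title_for(frames):
    """Build a title from the first frame that is not on the skip list."""
    for frame in frames:
        if frame not in SKIP:
            return f"possible deadlock in {frame}"
    return "possible deadlock"


# Top of the "-> #0" chain from the report in this thread.
stack_a = [
    "check_prev_add", "check_prevs_add", "validate_chain",
    "__lock_acquire", "lock_acquire",
    "trans_set_locked", "__bch2_trans_relock", "bch2_trans_relock",
]
# Hypothetical variant in which the trans_set_locked frame is absent.
stack_b = [f for f in stack_a if f != "trans_set_locked"]

print(title_for(stack_a))
print(title_for(stack_b))
```

With these inputs the two stacks yield the two titles listed above, even
though every other frame is identical.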

If you know which issue is the exact duplicate, please send a #syz dup
command for each such report to remove it from the web dashboard (and Cc
syzkaller@...glegroups.com so that we know that there was a parsing
problem).

-- 
Aleksandr

>
> Probably not a real issue with this particular bug - this exact situation
> is pretty rare, but I do have bugs accumulating in my dashboard that
> seem to have been fixed but I don't have a good way to close since I
> don't know the patch that fixed them (not going to bisect 20+ fixes...)

>
> >
> > --
> > Aleksandr
> >
> > On Fri, Nov 29, 2024 at 9:25 PM Kent Overstreet
> > <kent.overstreet@...ux.dev> wrote:
> > >
> > > On Fri, Nov 29, 2024 at 09:09:32AM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit:    7b1d1d4cfac0 Merge remote-tracking branch 'iommu/arm/smmu'..
> > > > git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17d6af78580000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=9bc44a6de1ceb5d6
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=78f4eb354f5ca6c1e6eb
> > > > compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > userspace arch: arm64
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=107bdf5f980000
> > > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=13ae49e8580000
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/4d4a0162c7c3/disk-7b1d1d4c.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/a8c47a4be472/vmlinux-7b1d1d4c.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/0e173b91f83e/Image-7b1d1d4c.gz.xz
> > > > mounted in repro #1: https://storage.googleapis.com/syzbot-assets/5ab7b24d2900/mount_0.gz
> > > > mounted in repro #2: https://storage.googleapis.com/syzbot-assets/fbfbb60588c1/mount_2.gz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+78f4eb354f5ca6c1e6eb@...kaller.appspotmail.com
> > > >
> > > > ======================================================
> > > > WARNING: possible circular locking dependency detected
> > > > 6.12.0-syzkaller-g7b1d1d4cfac0 #0 Not tainted
> > > > ------------------------------------------------------
> > > > syz-executor203/6432 is trying to acquire lock:
> > > > ffff0000da100128 (bcachefs_btree){+.+.}-{0:0}, at: trans_set_locked+0x5c/0x21c fs/bcachefs/btree_locking.h:193
> > > >
> > > > but task is already holding lock:
> > > > ffff0000dc661548 (&c->fsck_error_msgs_lock){+.+.}-{3:3}, at: __bch2_fsck_err+0x344/0x2544 fs/bcachefs/error.c:282
> > > >
> > > > which lock already depends on the new lock.
> > > >
> > > >
> > > > the existing dependency chain (in reverse order) is:
> > > >
> > > > -> #1 (&c->fsck_error_msgs_lock){+.+.}-{3:3}:
> > > >        __mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
> > > >        __mutex_lock kernel/locking/mutex.c:752 [inline]
> > > >        mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
> > > >        __bch2_fsck_err+0x344/0x2544 fs/bcachefs/error.c:282
> > > >        bch2_check_alloc_hole_freespace+0x5fc/0xd74 fs/bcachefs/alloc_background.c:1278
> > > >        bch2_check_alloc_info+0x1174/0x26f8 fs/bcachefs/alloc_background.c:1547
> > > >        bch2_run_recovery_pass+0xe4/0x1d4 fs/bcachefs/recovery_passes.c:191
> > > >        bch2_run_online_recovery_passes+0xa4/0x174 fs/bcachefs/recovery_passes.c:212
> > > >        bch2_fsck_online_thread_fn+0x150/0x3e8 fs/bcachefs/chardev.c:799
> > > >        thread_with_stdio_fn+0x64/0x134 fs/bcachefs/thread_with_file.c:298
> > > >        kthread+0x288/0x310 kernel/kthread.c:389
> > > >        ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > >
> > > > -> #0 (bcachefs_btree){+.+.}-{0:0}:
> > > >        check_prev_add kernel/locking/lockdep.c:3161 [inline]
> > > >        check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> > > >        validate_chain kernel/locking/lockdep.c:3904 [inline]
> > > >        __lock_acquire+0x33f8/0x77c8 kernel/locking/lockdep.c:5202
> > > >        lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5825
> > > >        trans_set_locked+0x88/0x21c fs/bcachefs/btree_locking.h:194
> > > >        __bch2_trans_relock+0x2a0/0x394 fs/bcachefs/btree_locking.c:785
> > > >        bch2_trans_relock+0x24/0x34 fs/bcachefs/btree_locking.c:793
> > > >        __bch2_fsck_err+0x1664/0x2544 fs/bcachefs/error.c:363
> > > >        bch2_check_alloc_hole_freespace+0x5fc/0xd74 fs/bcachefs/alloc_background.c:1278
> > > >        bch2_check_alloc_info+0x1174/0x26f8 fs/bcachefs/alloc_background.c:1547
> > > >        bch2_run_recovery_pass+0xe4/0x1d4 fs/bcachefs/recovery_passes.c:191
> > > >        bch2_run_online_recovery_passes+0xa4/0x174 fs/bcachefs/recovery_passes.c:212
> > > >        bch2_fsck_online_thread_fn+0x150/0x3e8 fs/bcachefs/chardev.c:799
> > > >        thread_with_stdio_fn+0x64/0x134 fs/bcachefs/thread_with_file.c:298
> > > >        kthread+0x288/0x310 kernel/kthread.c:389
> > > >        ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > >
> > > > other info that might help us debug this:
> > > >
> > > >  Possible unsafe locking scenario:
> > > >
> > > >        CPU0                    CPU1
> > > >        ----                    ----
> > > >   lock(&c->fsck_error_msgs_lock);
> > > >                                lock(bcachefs_btree);
> > > >                                lock(&c->fsck_error_msgs_lock);
> > > >   lock(bcachefs_btree);
> > > >
> > > >  *** DEADLOCK ***
> > > >
> > > > 3 locks held by syz-executor203/6432:
> > > >  #0: ffff0000dc600278 (&c->state_lock){++++}-{3:3}, at: bch2_run_online_recovery_passes+0x3c/0x174 fs/bcachefs/recovery_passes.c:204
> > > >  #1: ffff0000dc604398 (&c->btree_trans_barrier){.+.+}-{0:0}, at: srcu_lock_acquire+0x18/0x54 include/linux/srcu.h:150
> > > >  #2: ffff0000dc661548 (&c->fsck_error_msgs_lock){+.+.}-{3:3}, at: __bch2_fsck_err+0x344/0x2544 fs/bcachefs/error.c:282
> > > >
> > > > stack backtrace:
> > > > CPU: 1 UID: 0 PID: 6432 Comm: syz-executor203 Not tainted 6.12.0-syzkaller-g7b1d1d4cfac0 #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> > > > Call trace:
> > > >  show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:484 (C)
> > > >  __dump_stack lib/dump_stack.c:94 [inline]
> > > >  dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:120
> > > >  dump_stack+0x1c/0x28 lib/dump_stack.c:129
> > > >  print_circular_bug+0x154/0x1c0 kernel/locking/lockdep.c:2074
> > > >  check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2206
> > > >  check_prev_add kernel/locking/lockdep.c:3161 [inline]
> > > >  check_prevs_add kernel/locking/lockdep.c:3280 [inline]
> > > >  validate_chain kernel/locking/lockdep.c:3904 [inline]
> > > >  __lock_acquire+0x33f8/0x77c8 kernel/locking/lockdep.c:5202
> > > >  lock_acquire+0x240/0x728 kernel/locking/lockdep.c:5825
> > > >  trans_set_locked+0x88/0x21c fs/bcachefs/btree_locking.h:194
> > > >  __bch2_trans_relock+0x2a0/0x394 fs/bcachefs/btree_locking.c:785
> > > >  bch2_trans_relock+0x24/0x34 fs/bcachefs/btree_locking.c:793
> > > >  __bch2_fsck_err+0x1664/0x2544 fs/bcachefs/error.c:363
> > > >  bch2_check_alloc_hole_freespace+0x5fc/0xd74 fs/bcachefs/alloc_background.c:1278
> > > >  bch2_check_alloc_info+0x1174/0x26f8 fs/bcachefs/alloc_background.c:1547
> > > >  bch2_run_recovery_pass+0xe4/0x1d4 fs/bcachefs/recovery_passes.c:191
> > > >  bch2_run_online_recovery_passes+0xa4/0x174 fs/bcachefs/recovery_passes.c:212
> > > >  bch2_fsck_online_thread_fn+0x150/0x3e8 fs/bcachefs/chardev.c:799
> > > >  thread_with_stdio_fn+0x64/0x134 fs/bcachefs/thread_with_file.c:298
> > > >  kthread+0x288/0x310 kernel/kthread.c:389
> > > >  ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:862
> > > >
> > > >
> > > > ---
> > > > This report is generated by a bot. It may contain errors.
> > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > syzbot engineers can be reached at syzkaller@...glegroups.com.
> > > >
> > > > syzbot will keep track of this issue. See:
> > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > >
> > > > If the report is already addressed, let syzbot know by replying with:
> > > > #syz fix: exact-commit-title
> > > >
> > > > If you want syzbot to run the reproducer, reply with:
> > > > #syz test: git://repo/address.git branch-or-commit-hash
> > > > If you attach or paste a git patch, syzbot will apply it before testing.
> > > >
> > > > If you want to overwrite report's subsystems, reply with:
> > > > #syz set subsystems: new-subsystem
> > > > (See the list of subsystem names on the web dashboard)
> > > >
> > > > If the report is a duplicate of another one, reply with:
> > > > #syz dup: exact-subject-of-another-report
> > > >
> > > > If you want to undo deduplication, reply with:
> > > > #syz undup
> > >
> > > syzbot seems to now be re-opening bugs just because the patch hasn't
> > > been merged into the branch it's testing?
> > >