linux-kernel - Re: [syzbot] kernel BUG in assertfail


Message-ID: <CACT4Y+YrLLiaKnM3uVHZvRtj-UrDW-cwx4k6Lsh8no12nwvpNw@mail.gmail.com>
Date:   Mon, 31 May 2021 12:31:59 +0200
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Nikolay Borisov <nborisov@...e.com>
Cc:     syzbot <syzbot+a6bf271c02e4fe66b4e4@...kaller.appspotmail.com>,
        Chris Mason <clm@...com>, dsterba@...e.com,
        Josef Bacik <josef@...icpanda.com>,
        linux-btrfs@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>
Subject: Re: [syzbot] kernel BUG in assertfail

On Mon, May 31, 2021 at 11:27 AM Nikolay Borisov <nborisov@...e.com> wrote:
> >>>>>
> >>>>> syzbot found the following issue on:
> >>>>>
> >>>>> HEAD commit:    1434a312 Merge branch 'for-5.13-fixes' of git://git.kernel..
> >>>>> git tree:       upstream
> >>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=162843f3d00000
> >>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=9f3da44a01882e99
> >>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=a6bf271c02e4fe66b4e4
> >>>>>
> >>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>
> >>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>> Reported-by: syzbot+a6bf271c02e4fe66b4e4@...kaller.appspotmail.com
> >>>>>
> >>>>> assertion failed: !memcmp(fs_info->fs_devices->fsid, fs_info->super_copy->fsid, BTRFS_FSID_SIZE), in fs/btrfs/disk-io.c:3282
> >>>>
> >>>> This means a device contains a btrfs filesystem whose superblock has a
> >>>> different FSID than the one that all devices in the same fs_devices
> >>>> should share. This can happen in 2 ways: memory corruption, where
> >>>> either of the ->fsid members is corrupted, or a crash while a
> >>>> filesystem's fsid was being changed. We need more context: what did
> >>>> the test do?
> >>>
> >>> Hi Nikolay,
> >>>
> >>> From a semantic point of view we can consider that it just mounts /dev/random.
> >>> If syzbot comes up with a reproducer it will post it, but you seem to
> >>> have already figured out what happened, so I assume you can write a
> >>> unit test for this.
> >>>
> >>
> >> Well no, under normal circumstances this shouldn't trigger. So if syzbot
> >> is doing something as stupid as mounting /dev/random then I don't see a
> >> problem here. The assert is there to catch inconsistencies during normal
> >> operation, which doesn't seem to be the case here.
> >
> >
> > Does this mean that CONFIG_BTRFS_ASSERT needs to be disabled in any testing?
> > What is it intended for? Or can it only be enabled when mounting known
> > good images? But then I assume even btrfs unit tests mount some
> > invalid images, so it would mean it can't be used even during unit
> > testing?
> >
> > Looking at the output of "grep ASSERT fs/btrfs/*.c" it looks like most
> > of these actually check for something that "must never happen". E.g.
> > some lists/pointers are empty/non-empty in particular states. And
> > "must never happen" checks are for testing scenarios...
> >
> > Taking this particular FSID mismatch assert, should such corrupted
> > images be mounted for end users? Should users be notified? Currently
> > they are mounted and users are not notified, so what is the purpose of
> > this assertion?
> >
> > Perhaps CONFIG_BTRFS_ASSERT needs to be split into "must never happen"
> > checks that are enabled during testing and normal if's with pr_err for
> > user notifications?
>
> After going through the code you've convinced me. I just sent a patch
> turning the 2 debugging asserts into full-fledged checks in
> validate_super. So now the correct behavior is to prevent mounting of
> such images.  How can I force syzbot to retest with the given patch applied?

syzbot can test patches for issues with reproducers:
http://bit.do/syzbot#testing-patches
Unfortunately, this issue doesn't have a reproducer. I hope the change
is going to be reasonably straightforward, though. If this issue
happens again after the report is closed with a fix, syzbot will notify
us again, so an absence of new reports from syzbot will implicitly mean
that everything is fine.
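
To illustrate the distinction discussed in this thread between a debug-only "must never happen" assert and a validation check on untrusted on-disk data, here is a minimal userspace sketch. It is not the actual btrfs patch: the demo_* structs and functions are hypothetical stand-ins, and fprintf stands in for a kernel log call such as pr_err.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_FSID_SIZE 16 /* an FSID is UUID-sized; the name is hypothetical */

/* Hypothetical stand-ins, not the real btrfs structures. */
struct demo_super   { unsigned char fsid[DEMO_FSID_SIZE]; };
struct demo_devices { unsigned char fsid[DEMO_FSID_SIZE]; };

/*
 * Debug-style check: treats a mismatch as a programming error and
 * aborts. Reasonable only if no crafted or corrupted image can reach it.
 */
void demo_assert_fsid(const struct demo_devices *devs,
                      const struct demo_super *sb)
{
	assert(!memcmp(devs->fsid, sb->fsid, DEMO_FSID_SIZE));
}

/*
 * Validation-style check: the superblock is untrusted input, so a
 * mismatch is reported and turned into an error that the caller can
 * use to refuse the mount.
 */
int demo_validate_fsid(const struct demo_devices *devs,
                       const struct demo_super *sb)
{
	if (memcmp(devs->fsid, sb->fsid, DEMO_FSID_SIZE) != 0) {
		fprintf(stderr, "demo: superblock fsid does not match device fsid\n");
		return -EINVAL;
	}
	return 0;
}
```

With matching FSIDs both functions pass silently. With a mismatching superblock, demo_validate_fsid() returns -EINVAL and the caller can fail the mount, whereas demo_assert_fsid() would abort the whole program, which is the userspace analogue of the kernel BUG reported here.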