Message-ID: <20180406001325.GA133204@google.com>
Date:   Thu, 5 Apr 2018 17:13:25 -0700
From:   Eric Biggers <ebiggers@...gle.com>
To:     Dave Chinner <david@...morbit.com>
Cc:     Matthew Wilcox <willy@...radead.org>,
        "Theodore Y. Ts'o" <tytso@....edu>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        syzbot <syzbot+dc5ab2babdf22ca091af@...kaller.appspotmail.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        syzkaller-bugs@...glegroups.com, Al Viro <viro@...iv.linux.org.uk>
Subject: Re: WARNING in up_write

On Fri, Apr 06, 2018 at 08:32:26AM +1000, Dave Chinner wrote:
> On Wed, Apr 04, 2018 at 08:24:54PM -0700, Matthew Wilcox wrote:
> > On Wed, Apr 04, 2018 at 11:22:00PM -0400, Theodore Y. Ts'o wrote:
> > > On Wed, Apr 04, 2018 at 12:35:04PM -0700, Matthew Wilcox wrote:
> > > > On Wed, Apr 04, 2018 at 09:24:05PM +0200, Dmitry Vyukov wrote:
> > > > > On Tue, Apr 3, 2018 at 4:01 AM, syzbot
> > > > > <syzbot+dc5ab2babdf22ca091af@...kaller.appspotmail.com> wrote:
> > > > > > DEBUG_LOCKS_WARN_ON(sem->owner != get_current())
> > > > > > WARNING: CPU: 1 PID: 4441 at kernel/locking/rwsem.c:133 up_write+0x1cc/0x210
> > > > > > kernel/locking/rwsem.c:133
> > > > > > Kernel panic - not syncing: panic_on_warn set ...
> > > > 
> > > > Message-Id: <1522852646-2196-1-git-send-email-longman@...hat.com>
> > > >
> > > 
> > > We were way ahead of syzbot in this case.  :-)
> > 
> > Not really ... syzbot caught it Monday evening ;-)
> 
> Rather than arguing over who reported it first, I think that time
> would be better spent reflecting on why the syzbot report was
> completely ignored until *after* Ted diagnosed the issue
> independently and Waiman had already fixed it....
> 
> Clearly there is scope for improvement here.
> 
> Cheers,
> 

Well, ultimately a human needed to investigate the syzbot bug report to figure
out what was really going on.  In my view, the largest problem is that there are
simply too many bugs, so many are getting ignored.  If there were only a few
bugs, then Dmitry would investigate each one and send a "real" bug report of
better quality than the automated system can provide, or even send a fix
directly.  But in reality, on the same day this bug was reported, syzbot also
found 10 other bugs, and in the previous 2 days it had found 38 more.  No single
person can keep up with that.  You can see the current bug list, which has 172
open bugs, on the dashboard at https://syzkaller.appspot.com/.  Yes, the kernel
really is that broken.  Though, of course, most bugs are in specific modules, not
the core kernel.

And although quite a few of these bugs will turn out to be duplicates or even
already fixed, a human still has to look at each one to figure that out.
(Though, I do think that syzbot should try to automatically detect when a
reproducible bug was already fixed, via bisection.  It would cause a few bugs to
be incorrectly considered fixed, but it may be a worthwhile tradeoff.)

These bugs are all over the kernel as well, so most developers don't see the big
picture but rather just see a few bugs for "their" subsystem on "their"
subsystem's mailing list and sometimes demand special attention.  Of course,
it's great when people suggest ways to improve the process.  But it's not great
when people just don't feel responsible for fixing bugs and wait for
Someone Else to do it.

I'm hoping that in the future the syzbot "team", which seems to actually be just
Dmitry now, can get more resources towards helping fix the bugs.  But either
way, in the end Linux is a community effort.

Note also that syzbot wasn't super useful in this particular case because people
running xfstests came across the same bug.  But this is actually a rare case.
Most syzbot bug reports have been for weird corner cases or races that no one
ever thought of before, so there are no existing tests that find them.

Thanks,

Eric
