Message-ID: <CACT4Y+aHgz9cPa7OnVsNeHim72i6zVdjnbvVb0Z1oN2B8QLZqg@mail.gmail.com>
Date: Fri, 5 Jul 2019 15:18:06 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: "Theodore Ts'o" <tytso@....edu>,
syzbot <syzbot+4bfbbf28a2e50ab07368@...kaller.appspotmail.com>,
Andreas Dilger <adilger.kernel@...ger.ca>,
David Miller <davem@...emloft.net>, eladr@...lanox.com,
Ido Schimmel <idosch@...lanox.com>,
Jiri Pirko <jiri@...lanox.com>,
John Stultz <john.stultz@...aro.org>,
linux-ext4@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
netdev <netdev@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: syzkaller <syzkaller@...glegroups.com>
Subject: Re: INFO: rcu detected stall in ext4_write_checks
On Wed, Jun 26, 2019 at 8:43 PM Theodore Ts'o <tytso@....edu> wrote:
>
> On Wed, Jun 26, 2019 at 10:27:08AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: abf02e29 Merge tag 'pm-5.2-rc6' of git://git.kernel.org/pu..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1435aaf6a00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=e5c77f8090a3b96b
> > dashboard link: https://syzkaller.appspot.com/bug?extid=4bfbbf28a2e50ab07368
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11234c41a00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15d7f026a00000
> >
> > The bug was bisected to:
> >
> > commit 0c81ea5db25986fb2a704105db454a790c59709c
> > Author: Elad Raz <eladr@...lanox.com>
> > Date: Fri Oct 28 19:35:58 2016 +0000
> >
> > mlxsw: core: Add port type (Eth/IB) set API
>
> Um, so this doesn't pass the laugh test.
>
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=10393a89a00000
>
> It looks like the automated bisection machinery got confused by two
> failures getting triggered by the same repro; the symptoms changed
> over time. Initially, the failure was:
>
> crashed: INFO: rcu detected stall in {sys_sendfile64,ext4_file_write_iter}
>
> Later, the failure changed to something completely different, and it
> happened much earlier (before the test was even started):
>
> run #5: basic kernel testing failed: failed to copy test binary to VM: failed to run ["scp" "-P" "22" "-F" "/dev/null" "-o" "UserKnownHostsFile=/dev/null" "-o" "BatchMode=yes" "-o" "IdentitiesOnly=yes" "-o" "StrictHostKeyChecking=no" "-o" "ConnectTimeout=10" "-i" "/syzkaller/jobs/linux/workdir/image/key" "/tmp/syz-executor216456474" "root@...128.15.205:./syz-executor216456474"]: exit status 1
> Connection timed out during banner exchange
> lost connection
>
> Looks like an opportunity to improve the bisection engine?
Hi Ted,
Yes, these infrastructure errors plague bisections episodically.
That's https://github.com/google/syzkaller/issues/1250
It did not confuse bisection explicitly, as the engine understands that
these are infrastructure failures rather than a kernel crash. E.g. here
you may see that it correctly identified that this run was OK and
started bisection in the v4.10..v4.9 range despite 2 scp failures:
testing release v4.9
testing commit 69973b830859bc6529a7a0468ba0d80ee5117826 with gcc (GCC) 5.5.0
run #0: basic kernel testing failed: failed to copy test binary to VM:
failed to run ["scp" ...]: exit status 1
Connection timed out during banner exchange
run #1: basic kernel testing failed: failed to copy test binary to VM:
failed to run ["scp" ....]: exit status 1
Connection timed out during banner exchange
run #2: OK
run #3: OK
run #4: OK
run #5: OK
run #6: OK
run #7: OK
run #8: OK
run #9: OK
# git bisect start v4.10 v4.9
Though, of course, it may confuse bisection indirectly by reducing the
number of tests per commit.
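
To illustrate the idea (just a rough sketch in Go, the language
syzkaller is written in; runOutcome and verdictForCommit are made-up
names for this example, not the actual bisection code), per-run results
are classified so that infrastructure failures are never counted as
evidence about the kernel, they only reduce the number of effective runs:

package main

import "fmt"

// Hypothetical classification of a single test run. The point is only
// that infrastructure failures (e.g. scp timeouts) are kept separate
// from kernel crashes.
type runOutcome int

const (
	runOK runOutcome = iota
	runCrashed
	runInfraFailure // e.g. "failed to copy test binary to VM"
)

// verdictForCommit decides whether a commit looks bad for git bisect,
// skipping infrastructure failures entirely. Every skipped run lowers
// the number of effective tests, which is the indirect effect
// mentioned above.
func verdictForCommit(runs []runOutcome) (bad bool, effectiveRuns int) {
	for _, r := range runs {
		switch r {
		case runCrashed:
			return true, effectiveRuns + 1 // a real crash marks the commit bad
		case runOK:
			effectiveRuns++
		case runInfraFailure:
			// ignored: no evidence about the kernel either way
		}
	}
	return false, effectiveRuns
}

func main() {
	// Mirrors the v4.9 log above: 2 scp failures followed by 8 OK runs.
	runs := []runOutcome{runInfraFailure, runInfraFailure,
		runOK, runOK, runOK, runOK, runOK, runOK, runOK, runOK}
	bad, n := verdictForCommit(runs)
	fmt.Printf("bad=%v effective runs=%d\n", bad, n) // bad=false effective runs=8
}
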
So far I haven't been able to gather any significant info about these
failures. We gather console logs, but on these runs they are empty.
It's easy to blame everything on GCE, but I don't have any information
that would point either way. These failures just appear randomly in
production, usually in batches...