Message-ID: <YFOtBqSR6wq41G1T@mit.edu>
Date: Thu, 18 Mar 2021 15:41:58 -0400
From: "Theodore Ts'o" <tytso@....edu>
To: Eric Whitney <enwlinux@...il.com>
Cc: linux-ext4@...r.kernel.org, willy@...radead.org
Subject: Re: generic/418 regression seen on 5.12-rc3
On Thu, Mar 18, 2021 at 02:16:13PM -0400, Eric Whitney wrote:
> As mentioned in today's ext4 concall, I've seen generic/418 fail from time to
> time when run on 5.12-rc3 and 5.12-rc1 kernels. This first occurred when
> running the 1k test case using kvm-xfstests. I was then able to bisect the
> failure to a patch landed in the -rc1 merge window:
>
> (bd8a1f3655a7) mm/filemap: support readpage splitting a page
>
> Typical test output resulting from a failure looks like:
>
> QA output created by 418
> +cmpbuf: offset 0: Expected: 0x1, got 0x0
> +[6:0] FAIL - comparison failed, offset 3072
> +diotest -w -b 512 -n 8 -i 4 failed at loop 0
> Silence is golden
> ...
>
> I've also been able to reproduce the failure on -rc3 in the 4k test case as
> well. The failure frequency there was 10 out of 100 runs. It was anywhere
> from 2 to 8 failures out of 100 runs in the 1k case.
FWIW, testing on an -rc2-based kernel (ext4.git's tip), I wasn't able
to reproduce the failure with gce-xfstests in the ext4/4k, ext4/1k,
and xfs/1k test scenarios. This may be because of the I/O timing of
the persistent disk block device in GCE, differences in the number of
CPUs or the amount of memory available, or differences in the kernel
configuration used for the build.
I'm currently retrying with -rc3, with and without the kernel debug
configs, to see if that makes any difference...
- Ted