[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20180704002019.GC5104@thunk.org>
Date: Tue, 3 Jul 2018 20:20:19 -0400
From: "Theodore Y. Ts'o" <tytso@....edu>
To: Eric Whitney <enwlinux@...il.com>
Cc: linux-ext4@...r.kernel.org, Matthew Wilcox <willy@...radead.org>
Subject: Re: generic/347 data=journal regression in 4.17
(Added Matthew Wilcox to the CC list)
On Fri, Jun 15, 2018 at 11:45:46AM -0400, Eric Whitney wrote:
> As discussed in this week's ext4 concall, I'm seeing a regression for
> generic/347 in the data=journal test case when running xfstests-bld on
> 4.17. The test fails with the following output:
>
> QA output created by 347
> +/sbin/e2fsck: Unknown code ____ 251 while recovering journal of /dev/mapper/thin-vol
>
> The failure bisects to:
>
> errseq: Always report a writeback error once (b4678df184b3)
>
> This patch landed in v4.17-rc4. When it's reverted in v4.17, it's no longer
> possible to reproduce the error.
The bogus error was casued by a bug in e2fsck/journal.c. Once I fixed
sync_blockdev() to return the error using the kernel error return
conventions (we convert it back to an errcode_t higher up in the call
stack), we get the expected error:
/sbin/e2fsck: Input/output error while recovering journal of /dev/mapper/thin-vol
The problem is I'm not sure how to fix this in the kernel. In order
to keep the postgress folks, we can't just clear wb_err when the
dm_thin volume is grown, since there were previous errors. Unlike the
loop device case, the dm_thin device isn't getting reset in the test
generic/347. More space is added to the dm_thin device so that future
writes won't fail, but the previous writes definitely will fail, so
there's no obvious fix to dm-thin that won't break Postgres's
"interesting" error collection strategy.
So I think all we can do is have e2fsck deliberately throw away the
error by calling fsync on the block device. The unfortunate thing
about this is that this forces a SYNCHRONIZE CACHE command to be sent
to the storage device, which is otherwise unnecessary. I'll do this
for now (see below), since there appears to be no other alternative,
but I wonder if the right answer is we should add a new fnctl() which
allows the caller to collect the error and clear wb_err without
needing to send CACHE FLUSH to the device.
Willy, what do you think?
- Ted
--- a/lib/ext2fs/unix_io.c
+++ b/lib/ext2fs/unix_io.c
@@ -569,6 +569,14 @@ static errcode_t unix_open_channel(const char *name, int fd,
if (safe_getenv("UNIX_IO_FORCE_BOUNCE"))
flags |= IO_FLAG_FORCE_BOUNCE;
+#ifdef __linux__
+ /*
+ * We need to make sure any previous errors in the block
+ * device are thrown away, sigh. Scat, bogus error! Scat!!
+ */
+ (void) fsync(fd);
+#endif
+
retval = ext2fs_get_mem(sizeof(struct struct_io_channel), &io);
if (retval)
goto cleanup;
Powered by blists - more mailing lists