[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <lsq.1479082460.232546348@decadent.org.uk>
Date: Mon, 14 Nov 2016 00:14:20 +0000
From: Ben Hutchings <ben@...adent.org.uk>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC: akpm@...ux-foundation.org, "Jan Kara" <jack@...e.cz>,
"Theodore Ts'o" <tytso@....edu>
Subject: [PATCH 3.16 032/346] ext4: fix deadlock during page writeback
3.16.39-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara <jack@...e.cz>
commit 646caa9c8e196880b41cd3e3d33a2ebc752bdb85 upstream.
Commit 06bd3c36a733 (ext4: fix data exposure after a crash) uncovered a
deadlock in ext4_writepages() which was previously much harder to hit.
After this commit xfstest generic/130 reproduces the deadlock on small
filesystems.
The problem happens when ext4_do_update_inode() sets LARGE_FILE feature
and marks current inode handle as synchronous. That subsequently results
in ext4_journal_stop() called from ext4_writepages() to block waiting for
transaction commit while still holding page locks, reference to io_end,
and some prepared bio in mpd structure each of which can possibly block
transaction commit from completing and thus results in deadlock.
Fix the problem by releasing page locks, io_end reference, and
submitting prepared bio before calling ext4_journal_stop().
[ Changed to defer the call to ext4_journal_stop() only if the handle
is synchronous. --tytso ]
Reported-and-tested-by: Eryu Guan <eguan@...hat.com>
Signed-off-by: Theodore Ts'o <tytso@....edu>
Signed-off-by: Jan Kara <jack@...e.cz>
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
fs/ext4/inode.c | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2610,13 +2610,36 @@ retry:
done = true;
}
}
- ext4_journal_stop(handle);
+ /*
+ * Caution: If the handle is synchronous,
+ * ext4_journal_stop() can wait for transaction commit
+ * to finish which may depend on writeback of pages to
+ * complete or on page lock to be released. In that
+ * case, we have to wait until after after we have
+ * submitted all the IO, released page locks we hold,
+ * and dropped io_end reference (for extent conversion
+ * to be able to complete) before stopping the handle.
+ */
+ if (!ext4_handle_valid(handle) || handle->h_sync == 0) {
+ ext4_journal_stop(handle);
+ handle = NULL;
+ }
/* Submit prepared bio */
ext4_io_submit(&mpd.io_submit);
/* Unlock pages we didn't use */
mpage_release_unused_pages(&mpd, give_up_on_write);
- /* Drop our io_end reference we got from init */
- ext4_put_io_end(mpd.io_submit.io_end);
+ /*
+ * Drop our io_end reference we got from init. We have
+ * to be careful and use deferred io_end finishing if
+ * we are still holding the transaction as we can
+ * release the last reference to io_end which may end
+ * up doing unwritten extent conversion.
+ */
+ if (handle) {
+ ext4_put_io_end_defer(mpd.io_submit.io_end);
+ ext4_journal_stop(handle);
+ } else
+ ext4_put_io_end(mpd.io_submit.io_end);
if (ret == -ENOSPC && sbi->s_journal) {
/*
Powered by blists - more mailing lists