linux-ext4 - [RFC PATCH 07/11] ext4: grab page before starting transaction handle in ext4_convert_inline_data_to

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20200423233705.5878-8-mfo@canonical.com>
Date:   Thu, 23 Apr 2020 20:37:01 -0300
From:   Mauricio Faria de Oliveira <mfo@...onical.com>
To:     linux-ext4@...r.kernel.org, "Theodore Y. Ts'o" <tytso@....edu>
Cc:     dann frazier <dann.frazier@...onical.com>,
        Andreas Dilger <adilger@...ger.ca>, Jan Kara <jack@...e.com>
Subject: [RFC PATCH 07/11] ext4: grab page before starting transaction handle in ext4_convert_inline_data_to_extent()

Since even just grab_cache_page_write_begin() might deadlock with
page writeback from __ext4_journalled_writepage() if stable pages
are required (as it does "lock_page(); wait_for_stable_page();")
and the handle to the same/running transaction is currently held,
it _cannot_ be called between ext4_journal_start/stop() as usual,
we have to be careful.

We have two options:

1) open-code the first part of grab_cache_page_write_begin()
   (before wait_for_stable_pages()) just before calling it,
   then check for deadlock and retry (a bit ugly but valid.)

2) move it before ext4_journal_start() to avoid the deadlock.

Option 2 has been done as optimization to ext4_write_begin()
in commit 47564bfb95bf ("ext4: grab page before starting
transaction handle in write_begin()"), and can similarly
apply to this case.

Since it sounds more official, counts as optimization that
prevents long transaction hold time, and isn't ugly, do it.

Signed-off-by: Mauricio Faria de Oliveira <mfo@...onical.com>
---
 fs/ext4/inline.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index f35e289e17aa..5fd275098d10 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -548,23 +548,42 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 	if (ret)
 		return ret;
 
-retry:
+	/*
+	 * grab_cache_page_write_begin() can take a long time if the
+	 * system is thrashing due to memory pressure, or if the page
+	 * is being written back.  So grab it first before we start
+	 * the transaction handle.  This also allows us to allocate
+	 * the page (if needed) without using GFP_NOFS.
+	 */
+retry_grab:
+	page = grab_cache_page_write_begin(mapping, 0, flags);
+	if (!page) {
+		ret = -ENOMEM;
+		handle = NULL;
+		goto out;
+	}
+	unlock_page(page);
+
+retry_journal:
 	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, needed_blocks);
 	if (IS_ERR(handle)) {
+		put_page(page);
 		ret = PTR_ERR(handle);
 		handle = NULL;
+		page = NULL;
 		goto out;
 	}
 
-	/* We cannot recurse into the filesystem as the transaction is already
-	 * started */
-	flags |= AOP_FLAG_NOFS;
-
-	page = grab_cache_page_write_begin(mapping, 0, flags);
-	if (!page) {
-		ret = -ENOMEM;
-		goto out;
+	lock_page(page);
+	if (page->mapping != mapping) {
+		/* The page got truncated from under us */
+		unlock_page(page);
+		put_page(page);
+		ext4_journal_stop(handle);
+		goto retry_grab;
 	}
+	/* In case writeback began while the page was unlocked */
+	wait_for_stable_page(page);
 
 	ext4_write_lock_xattr(inode, &no_expand);
 	sem_held = 1;
@@ -600,8 +619,6 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 
 	if (ret) {
 		unlock_page(page);
-		put_page(page);
-		page = NULL;
 		ext4_orphan_add(handle, inode);
 		ext4_write_unlock_xattr(inode, &no_expand);
 		sem_held = 0;
@@ -616,10 +633,13 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 		 */
 		if (inode->i_nlink)
 			ext4_orphan_del(NULL, inode);
-	}
 
-	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
-		goto retry;
+		if (ret == -ENOSPC &&
+		    ext4_should_retry_alloc(inode->i_sb, &retries))
+			goto retry_journal;
+		put_page(page);
+		page = NULL;
+	}
 
 	if (page)
 		block_commit_write(page, from, to);
-- 
2.20.1