lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200423233705.5878-10-mfo@canonical.com>
Date:   Thu, 23 Apr 2020 20:37:03 -0300
From:   Mauricio Faria de Oliveira <mfo@...onical.com>
To:     linux-ext4@...r.kernel.org, "Theodore Y. Ts'o" <tytso@....edu>
Cc:     dann frazier <dann.frazier@...onical.com>,
        Andreas Dilger <adilger@...ger.ca>, Jan Kara <jack@...e.com>
Subject: [RFC PATCH 09/11] ext4: grab page before starting transaction handle in ext4_try_to_write_inline_data()

Since even just grab_cache_page_write_begin() might deadlock with
page writeback from __ext4_journalled_writepage() if stable pages
are required (as it does "lock_page(); wait_for_stable_page();")
and the handle to the same/running transaction is currently held,
it _cannot_ be called between ext4_journal_start/stop() as usual,
we have to be careful.

We have two options:

1) open-code the first part of grab_cache_page_write_begin()
   (before wait_for_stable_pages()) just before calling it,
   then check for deadlock and retry (a bit ugly but valid.)

2) move it before ext4_journal_start() to avoid the deadlock.

Option 2 has been done as optimization to ext4_write_begin()
in commit 47564bfb95bf ("ext4: grab page before starting
transaction handle in write_begin()"), and can similarly
apply to this case.

Since it sounds more official, counts as optimization that
prevents long transaction hold time, and isn't ugly, do it.

Signed-off-by: Mauricio Faria de Oliveira <mfo@...onical.com>
---
 fs/ext4/inline.c | 46 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 37 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 10665b066bb7..9bb85e06854d 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -692,37 +692,65 @@ int ext4_try_to_write_inline_data(struct address_space *mapping,
 	if (ret)
 		return ret;
 
+	/*
+	 * grab_cache_page_write_begin() can take a long time if the
+	 * system is thrashing due to memory pressure, or if the page
+	 * is being written back.  So grab it first before we start
+	 * the transaction handle.  This also allows us to allocate
+	 * the page (if needed) without using GFP_NOFS.
+	 */
+retry_grab:
+	page = grab_cache_page_write_begin(mapping, 0, flags);
+	if (!page) {
+		ret = -ENOMEM;
+		handle = NULL;
+		goto out;
+	}
+	unlock_page(page);
+
 	/*
 	 * The possible write could happen in the inode,
 	 * so try to reserve the space in inode first.
 	 */
 	handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
 	if (IS_ERR(handle)) {
+		put_page(page);
 		ret = PTR_ERR(handle);
 		handle = NULL;
 		goto out;
 	}
 
+	lock_page(page);
+	if (page->mapping != mapping) {
+		/* The page got truncated from under us */
+		unlock_page(page);
+		put_page(page);
+		ext4_journal_stop(handle);
+		goto retry_grab;
+	}
+	/* In case writeback began while the page was unlocked */
+	wait_for_stable_page(page);
+
 	ret = ext4_prepare_inline_data(handle, inode, pos + len);
-	if (ret && ret != -ENOSPC)
+	if (ret && ret != -ENOSPC) {
+		unlock_page(page);
+		put_page(page);
 		goto out;
+	}
 
 	/* We don't have space in inline inode, so convert it to extent. */
 	if (ret == -ENOSPC) {
+		unlock_page(page);
+		put_page(page);
 		ext4_journal_stop(handle);
 		brelse(iloc.bh);
 		goto convert;
 	}
 
 	ret = ext4_journal_get_write_access(handle, iloc.bh);
-	if (ret)
-		goto out;
-
-	flags |= AOP_FLAG_NOFS;
-
-	page = grab_cache_page_write_begin(mapping, 0, flags);
-	if (!page) {
-		ret = -ENOMEM;
+	if (ret) {
+		unlock_page(page);
+		put_page(page);
 		goto out;
 	}
 
-- 
2.20.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ