lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230220122004.26555-1-hans.holmberg@wdc.com>
Date:   Mon, 20 Feb 2023 13:20:04 +0100
From:   Hans Holmberg <hans.holmberg@....com>
To:     Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <chao@...nel.org>
Cc:     linux-f2fs-devel@...ts.sourceforge.net, damien.lemoal@....com,
        aravind.ramesh@....com, hans@...tronix.com,
        linux-kernel@...r.kernel.org, Hans Holmberg <hans.holmberg@....com>
Subject: [RFC PATCH] f2fs: preserve direct write semantics when buffering is forced

In some cases, e.g. for zoned block devices, direct writes are
forced into buffered writes that will populate the page cache
and be written out just like buffered io.

Direct reads, on the other hand, is supported for the zoned
block device case. This has the effect that applications
built for direct io will fill up the page cache with data
that will never be read, and that is a waste of resources.

If we agree that this is a problem, how do we fix it?

A) Supporting proper direct writes for zoned block devices would
be the best, but it is currently not supported (probably for
a good but non-obvious reason). Would it be feasible to
implement proper direct IO?

B) Avoid the cost of keeping unwanted data by syncing and throwing
out the cached pages for buffered O_DIRECT writes before completion.

This patch implements B) by reusing the code for how partial
block writes are flushed out on the "normal" direct write path.

Note that this changes the performance characteristics of f2fs
quite a bit.

Direct IO performance for zoned block devices is lower for
small writes after this patch, but this should be expected
with direct IO and in line with how f2fs behaves on top of
conventional block devices.

Another open question is if the flushing should be done for
all cases where buffered writes are forced.

Signed-off-by: Hans Holmberg <hans.holmberg@....com>
---
 fs/f2fs/file.c | 38 ++++++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ecbc8c135b49..4e57c37bce35 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -4513,6 +4513,19 @@ static const struct iomap_dio_ops f2fs_iomap_dio_write_ops = {
 	.end_io = f2fs_dio_write_end_io,
 };
 
+static void f2fs_flush_buffered_write(struct address_space *mapping,
+				      loff_t start_pos, loff_t end_pos)
+{
+	int ret;
+
+	ret = filemap_write_and_wait_range(mapping, start_pos, end_pos);
+	if (ret < 0)
+		return;
+	invalidate_mapping_pages(mapping,
+				 start_pos >> PAGE_SHIFT,
+				 end_pos >> PAGE_SHIFT);
+}
+
 static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
 				   bool *may_need_sync)
 {
@@ -4612,14 +4625,9 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
 
 			ret += ret2;
 
-			ret2 = filemap_write_and_wait_range(file->f_mapping,
-							    bufio_start_pos,
-							    bufio_end_pos);
-			if (ret2 < 0)
-				goto out;
-			invalidate_mapping_pages(file->f_mapping,
-						 bufio_start_pos >> PAGE_SHIFT,
-						 bufio_end_pos >> PAGE_SHIFT);
+			f2fs_flush_buffered_write(file->f_mapping,
+						  bufio_start_pos,
+						  bufio_end_pos);
 		}
 	} else {
 		/* iomap_dio_rw() already handled the generic_write_sync(). */
@@ -4717,8 +4725,22 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	inode_unlock(inode);
 out:
 	trace_f2fs_file_write_iter(inode, orig_pos, orig_count, ret);
+
 	if (ret > 0 && may_need_sync)
 		ret = generic_write_sync(iocb, ret);
+
+	/* If buffered IO was forced, flush and drop the data from
+	 * the page cache to preserve O_DIRECT semantics
+	 */
+	if (ret > 0 && !dio && (iocb->ki_flags & IOCB_DIRECT)) {
+		struct file *file = iocb->ki_filp;
+		loff_t end_pos = orig_pos + ret - 1;
+
+		f2fs_flush_buffered_write(file->f_mapping,
+					  orig_pos,
+					  end_pos);
+	}
+
 	return ret;
 }
 
-- 
2.25.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ