Message-ID: <20230222110833epcms2p8906f628cfa52910867e4c5ed649f791d@epcms2p8>
Date: Wed, 22 Feb 2023 20:08:33 +0900
From: Yonggil Song <yonggil.song@...sung.com>
To: "hans.holmberg@....com" <hans.holmberg@....com>,
Jaegeuk Kim <jaegeuk@...nel.org>, Chao Yu <chao@...nel.org>
CC: "damien.lemoal@....com" <damien.lemoal@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-f2fs-devel@...ts.sourceforge.net"
<linux-f2fs-devel@...ts.sourceforge.net>
Subject: RE: [f2fs-dev] [RFC PATCH] f2fs: preserve direct write semantics
when buffering is forced
>In some cases, e.g. for zoned block devices, direct writes are
>forced into buffered writes that will populate the page cache
>and be written out just like buffered IO.
>
>Direct reads, on the other hand, are supported for the zoned
>block device case. This has the effect that applications
>built for direct IO will fill up the page cache with data
>that will never be read, which is a waste of resources.
>
>If we agree that this is a problem, how do we fix it?
I agree.
Thanks.
>
>A) Supporting proper direct writes for zoned block devices would
>be the best, but it is currently not supported (probably for
>a good but non-obvious reason). Would it be feasible to
>implement proper direct IO?
>
>B) Avoid the cost of keeping unwanted data by syncing and throwing
>out the cached pages for buffered O_DIRECT writes before completion.
>
>This patch implements B) by reusing the code for how partial
>block writes are flushed out on the "normal" direct write path.
>
>Note that this changes the performance characteristics of f2fs
>quite a bit.
>
>Direct IO performance for zoned block devices is lower for
>small writes after this patch, but this should be expected
>with direct IO and in line with how f2fs behaves on top of
>conventional block devices.
>
>Another open question is if the flushing should be done for
>all cases where buffered writes are forced.
>
>Signed-off-by: Hans Holmberg <hans.holmberg@....com>
Reviewed-by: Yonggil Song <yonggil.song@...sung.com>
>---
> fs/f2fs/file.c | 38 ++++++++++++++++++++++++++++++--------
> 1 file changed, 30 insertions(+), 8 deletions(-)
>
>diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
>index ecbc8c135b49..4e57c37bce35 100644
>--- a/fs/f2fs/file.c
>+++ b/fs/f2fs/file.c
>@@ -4513,6 +4513,19 @@ static const struct iomap_dio_ops f2fs_iomap_dio_write_ops = {
> .end_io = f2fs_dio_write_end_io,
> };
>
>+static void f2fs_flush_buffered_write(struct address_space *mapping,
>+ loff_t start_pos, loff_t end_pos)
>+{
>+ int ret;
>+
>+ ret = filemap_write_and_wait_range(mapping, start_pos, end_pos);
>+ if (ret < 0)
>+ return;
>+ invalidate_mapping_pages(mapping,
>+ start_pos >> PAGE_SHIFT,
>+ end_pos >> PAGE_SHIFT);
>+}
>+
> static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
> bool *may_need_sync)
> {
>@@ -4612,14 +4625,9 @@ static ssize_t f2fs_dio_write_iter(struct kiocb *iocb, struct iov_iter *from,
>
> ret += ret2;
>
>- ret2 = filemap_write_and_wait_range(file->f_mapping,
>- bufio_start_pos,
>- bufio_end_pos);
>- if (ret2 < 0)
>- goto out;
>- invalidate_mapping_pages(file->f_mapping,
>- bufio_start_pos >> PAGE_SHIFT,
>- bufio_end_pos >> PAGE_SHIFT);
>+ f2fs_flush_buffered_write(file->f_mapping,
>+ bufio_start_pos,
>+ bufio_end_pos);
> }
> } else {
> /* iomap_dio_rw() already handled the generic_write_sync(). */
>@@ -4717,8 +4725,22 @@ static ssize_t f2fs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
> inode_unlock(inode);
> out:
> trace_f2fs_file_write_iter(inode, orig_pos, orig_count, ret);
>+
> if (ret > 0 && may_need_sync)
> ret = generic_write_sync(iocb, ret);
>+
>+ /* If buffered IO was forced, flush and drop the data from
>+ * the page cache to preserve O_DIRECT semantics
>+ */
>+ if (ret > 0 && !dio && (iocb->ki_flags & IOCB_DIRECT)) {
>+ struct file *file = iocb->ki_filp;
>+ loff_t end_pos = orig_pos + ret - 1;
>+
>+ f2fs_flush_buffered_write(file->f_mapping,
>+ orig_pos,
>+ end_pos);
>+ }
>+
> return ret;
> }
>
>--
>2.25.1