lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240829110222.126685-1-yangerkun@huaweicloud.com>
Date: Thu, 29 Aug 2024 19:02:22 +0800
From: Yang Erkun <yangerkun@...weicloud.com>
To: tytso@....edu,
	adilger.kernel@...ger.ca,
	jack@...e.cz
Cc: linux-ext4@...r.kernel.org,
	yangerkun@...wei.com,
	yangerkun@...weicloud.com
Subject: [PATCH v2] ext4: dax: keep orphan list before truncate overflow allocated blocks

From: yangerkun <yangerkun@...wei.com>

Any extending write for ext4 requires the inode to be placed on the
orphan list before the actual write. In addition, the inode can be
actually removed from the orphan list only after all writes are
completed. Otherwise we'd leave allocated blocks beyond i_disksize if we
could not copy all the data into allocated block and e2fsck would
complain.

Currently, direct IO and buffered IO comply with this logic(buffered
IO will truncate all overflow allocated blocks that has not been
written successfully, and direct IO will truncate all allocated blocks
when error occurs). However, dax write break this since dax write will
remove the inode from the orphan list by calling
ext4_handle_inode_extension unconditionally during extending write.

We add a argument to help determine does we do a fully write, and for
the case not fully write, we leave the inode on the orphan list, and the
latter ext4_inode_extension_cleanup will help us truncate the overflow
allocated blocks, and then remove the inode from the orphan list.

Signed-off-by: yangerkun <yangerkun@...wei.com>
---
 fs/ext4/file.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index be061bb64067..f14aed14b9cf 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -306,7 +306,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
 }
 
 static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
-					   ssize_t count)
+					   ssize_t written, ssize_t count)
 {
 	handle_t *handle;
 
@@ -315,7 +315,7 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
-	if (ext4_update_inode_size(inode, offset + count)) {
+	if (ext4_update_inode_size(inode, offset + written)) {
 		int ret = ext4_mark_inode_dirty(handle, inode);
 		if (unlikely(ret)) {
 			ext4_journal_stop(handle);
@@ -323,11 +323,11 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
 		}
 	}
 
-	if (inode->i_nlink)
+	if ((written == count) && inode->i_nlink)
 		ext4_orphan_del(handle, inode);
 	ext4_journal_stop(handle);
 
-	return count;
+	return written;
 }
 
 /*
@@ -393,7 +393,7 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
 	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
 	    pos + size <= i_size_read(inode))
 		return size;
-	return ext4_handle_inode_extension(inode, pos, size);
+	return ext4_handle_inode_extension(inode, pos, size, size);
 }
 
 static const struct iomap_dio_ops ext4_dio_write_ops = {
@@ -669,7 +669,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
 
 	if (extend) {
-		ret = ext4_handle_inode_extension(inode, offset, ret);
+		ret = ext4_handle_inode_extension(inode, offset, ret, count);
 		ext4_inode_extension_cleanup(inode, ret < (ssize_t)count);
 	}
 out:
-- 
2.39.2


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ