[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240829122131.xa44p2nfwqwptmtr@quack3>
Date: Thu, 29 Aug 2024 14:21:31 +0200
From: Jan Kara <jack@...e.cz>
To: Yang Erkun <yangerkun@...weicloud.com>
Cc: tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz,
linux-ext4@...r.kernel.org, yangerkun@...wei.com
Subject: Re: [PATCH v2] ext4: dax: keep orphan list before truncate overflow
allocated blocks
On Thu 29-08-24 19:02:22, Yang Erkun wrote:
> From: yangerkun <yangerkun@...wei.com>
>
> Any extending write for ext4 requires the inode to be placed on the
> orphan list before the actual write. In addition, the inode can be
> actually removed from the orphan list only after all writes are
> completed. Otherwise we'd leave allocated blocks beyond i_disksize if we
> could not copy all the data into allocated block and e2fsck would
> complain.
>
> Currently, direct IO and buffered IO comply with this logic(buffered
> IO will truncate all overflow allocated blocks that has not been
> written successfully, and direct IO will truncate all allocated blocks
> when error occurs). However, dax write break this since dax write will
> remove the inode from the orphan list by calling
> ext4_handle_inode_extension unconditionally during extending write.
>
> We add a argument to help determine does we do a fully write, and for
> the case not fully write, we leave the inode on the orphan list, and the
> latter ext4_inode_extension_cleanup will help us truncate the overflow
> allocated blocks, and then remove the inode from the orphan list.
>
> Signed-off-by: yangerkun <yangerkun@...wei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@...e.cz>
Honza
> ---
> fs/ext4/file.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index be061bb64067..f14aed14b9cf 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -306,7 +306,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
> }
>
> static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
> - ssize_t count)
> + ssize_t written, ssize_t count)
> {
> handle_t *handle;
>
> @@ -315,7 +315,7 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
> if (IS_ERR(handle))
> return PTR_ERR(handle);
>
> - if (ext4_update_inode_size(inode, offset + count)) {
> + if (ext4_update_inode_size(inode, offset + written)) {
> int ret = ext4_mark_inode_dirty(handle, inode);
> if (unlikely(ret)) {
> ext4_journal_stop(handle);
> @@ -323,11 +323,11 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset,
> }
> }
>
> - if (inode->i_nlink)
> + if ((written == count) && inode->i_nlink)
> ext4_orphan_del(handle, inode);
> ext4_journal_stop(handle);
>
> - return count;
> + return written;
> }
>
> /*
> @@ -393,7 +393,7 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
> if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
> pos + size <= i_size_read(inode))
> return size;
> - return ext4_handle_inode_extension(inode, pos, size);
> + return ext4_handle_inode_extension(inode, pos, size, size);
> }
>
> static const struct iomap_dio_ops ext4_dio_write_ops = {
> @@ -669,7 +669,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
> ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
>
> if (extend) {
> - ret = ext4_handle_inode_extension(inode, offset, ret);
> + ret = ext4_handle_inode_extension(inode, offset, ret, count);
> ext4_inode_extension_cleanup(inode, ret < (ssize_t)count);
> }
> out:
> --
> 2.39.2
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists