Message-ID: <aSbpn7xquvdglW21@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Wed, 26 Nov 2025 17:20:55 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
jack@...e.cz, yi.zhang@...wei.com, yizhang089@...il.com,
libaokun1@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v2 04/13] ext4: don't set EXT4_GET_BLOCKS_CONVERT when
splitting before submitting I/O
On Fri, Nov 21, 2025 at 02:08:02PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@...wei.com>
>
> When allocating blocks during within-EOF DIO and writeback with
> dioread_nolock enabled, EXT4_GET_BLOCKS_PRE_IO was set to split an
> existing large unwritten extent. However, EXT4_GET_BLOCKS_CONVERT was
> also set when calling ext4_split_convert_extents(), which can expose
> stale data.
>
> Assume we have an unwritten extent, and then DIO writes the second half.
>
> [UUUUUUUUUUUUUUUU] on-disk extent        U: unwritten extent
> [UUUUUUUUUUUUUUUU] extent status tree
>            |<-->| ----> dio write this range
>
> First, ext4_iomap_alloc() calls ext4_map_blocks() with
> EXT4_GET_BLOCKS_PRE_IO, EXT4_GET_BLOCKS_UNWRIT_EXT and
> EXT4_GET_BLOCKS_CREATE flags set. ext4_map_blocks() finds this extent and
> calls ext4_split_convert_extents() with EXT4_GET_BLOCKS_CONVERT and the
> above flags set.
>
> Then, ext4_split_convert_extents() calls ext4_split_extent() with
> EXT4_EXT_MAY_ZEROOUT, EXT4_EXT_MARK_UNWRIT2 and EXT4_EXT_DATA_VALID2
> flags set, and it calls ext4_split_extent_at() to split the second half
> with EXT4_EXT_DATA_VALID2, EXT4_EXT_MARK_UNWRIT1, EXT4_EXT_MAY_ZEROOUT
> and EXT4_EXT_MARK_UNWRIT2 flags set. However, ext4_split_extent_at()
> fails to insert the extent because of a temporary lack of space
> (-ENOSPC). It zeroes out the first half but converts the entire on-disk
> extent to written since the EXT4_EXT_DATA_VALID2 flag is set, while
> leaving the second half as unwritten in the extent status tree.
>
> [0000000000SSSSSS] data S: stale data, 0: zeroed
> [WWWWWWWWWWWWWWWW] on-disk extent W: written extent
> [WWWWWWWWWWUUUUUU] extent status tree
>
> Finally, if the DIO fails to write data to the disk, the stale data in
> the second half will be exposed once the cached extent entry is gone.
>
> Fix this issue by not passing EXT4_GET_BLOCKS_CONVERT when splitting
> an unwritten extent before submitting I/O, and make
> ext4_split_convert_extents() zero out the entire extent range
> in this case, and also mark the extent in the extent status
> tree for consistency.
Hi Yi,
Your analysis makes sense to me and I agree that passing the CONVERT flag
there might not have been correct, since we are not necessarily
converting the extent to initialized.
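For reference, here is a small user-space sketch of how I read the flag
selection in the ext4_split_convert_extents() hunk, before and after this
patch. It only models the lines shown in the diff. The GB_* and SP_*
names and their bit values are made up for this sketch (the real
definitions live in the ext4 headers), and within_eof stands in for the
"ee_block + ee_len <= eof_block" check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the EXT4_GET_BLOCKS_* flags (values invented). */
#define GB_UNWRIT_EXT        0x01u /* EXT4_GET_BLOCKS_UNWRIT_EXT */
#define GB_CONVERT           0x02u /* EXT4_GET_BLOCKS_CONVERT */
#define GB_CONVERT_UNWRITTEN 0x04u /* EXT4_GET_BLOCKS_CONVERT_UNWRITTEN */

/* Hypothetical stand-ins for the EXT4_EXT_* split flags (values invented). */
#define SP_MAY_ZEROOUT        0x01u /* EXT4_EXT_MAY_ZEROOUT */
#define SP_MARK_UNWRIT2       0x02u /* EXT4_EXT_MARK_UNWRIT2 */
#define SP_DATA_VALID2        0x04u /* EXT4_EXT_DATA_VALID2 */
#define SP_DATA_ENTIRE_VALID1 0x08u /* EXT4_EXT_DATA_ENTIRE_VALID1 */

/* Old behaviour: the pre-I/O caller always ORed in CONVERT. */
static unsigned int split_flags_before_patch(unsigned int flags, bool within_eof)
{
	unsigned int split_flag = 0;

	flags |= GB_CONVERT; /* done by the caller on the PRE_IO path */

	if (flags & GB_CONVERT_UNWRITTEN) {
		split_flag |= SP_DATA_ENTIRE_VALID1;
	} else if (flags & GB_CONVERT) {
		split_flag |= within_eof ? SP_MAY_ZEROOUT : 0;
		split_flag |= SP_MARK_UNWRIT2 | SP_DATA_VALID2;
	}
	return split_flag;
}

/* New behaviour: DATA_VALID2 is set only when CONVERT was really requested. */
static unsigned int split_flags_after_patch(unsigned int flags, bool within_eof)
{
	unsigned int split_flag = 0;

	if (flags & GB_CONVERT_UNWRITTEN) {
		split_flag |= SP_DATA_ENTIRE_VALID1;
	} else if (flags & (GB_UNWRIT_EXT | GB_CONVERT)) {
		split_flag |= within_eof ? SP_MAY_ZEROOUT : 0;
		split_flag |= SP_MARK_UNWRIT2;
		if (flags & GB_CONVERT)
			split_flag |= SP_DATA_VALID2;
	}
	return split_flag;
}

/* Pre-I/O split of an unwritten extent within EOF, as in the changelog. */
static void sketch_selfcheck(void)
{
	unsigned int before = split_flags_before_patch(GB_UNWRIT_EXT, true);
	unsigned int after = split_flags_after_patch(GB_UNWRIT_EXT, true);

	assert(before & SP_DATA_VALID2);   /* second half treated as valid */
	assert(!(after & SP_DATA_VALID2)); /* no longer treated as valid */
}
```

With this reading, the pre-I/O split no longer claims the second half
holds valid data, which matches the stale data scenario you describe.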
Other than that, feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
Regards,
ojaswin
>
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> ---
> fs/ext4/extents.c | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index cafe66cb562f..2db84f6b0056 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3733,11 +3733,15 @@ static struct ext4_ext_path *ext4_split_convert_extents(handle_t *handle,
> /* Convert to unwritten */
> if (flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN) {
> split_flag |= EXT4_EXT_DATA_ENTIRE_VALID1;
> - /* Convert to initialized */
> - } else if (flags & EXT4_GET_BLOCKS_CONVERT) {
> + /* Split the existing unwritten extent */
> + } else if (flags & (EXT4_GET_BLOCKS_UNWRIT_EXT |
> + EXT4_GET_BLOCKS_CONVERT)) {
> split_flag |= ee_block + ee_len <= eof_block ?
> EXT4_EXT_MAY_ZEROOUT : 0;
> - split_flag |= (EXT4_EXT_MARK_UNWRIT2 | EXT4_EXT_DATA_VALID2);
> + split_flag |= EXT4_EXT_MARK_UNWRIT2;
> + /* Convert to initialized */
> + if (flags & EXT4_GET_BLOCKS_CONVERT)
> + split_flag |= EXT4_EXT_DATA_VALID2;
> }
> flags |= EXT4_GET_BLOCKS_PRE_IO;
> return ext4_split_extent(handle, inode, path, map, split_flag, flags,
> @@ -3913,7 +3917,7 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
> /* get_block() before submitting IO, split the extent */
> if (flags & EXT4_GET_BLOCKS_PRE_IO) {
> path = ext4_split_convert_extents(handle, inode, map, path,
> - flags | EXT4_GET_BLOCKS_CONVERT, allocated);
> + flags, allocated);
> if (IS_ERR(path))
> return path;
> /*
> --
> 2.46.1
>