lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aSbpn7xquvdglW21@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Wed, 26 Nov 2025 17:20:55 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
        jack@...e.cz, yi.zhang@...wei.com, yizhang089@...il.com,
        libaokun1@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v2 04/13] ext4: don't set EXT4_GET_BLOCKS_CONVERT when
 splitting before submitting I/O

On Fri, Nov 21, 2025 at 02:08:02PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@...wei.com>
> 
> When allocating blocks during within-EOF DIO and writeback with
> dioread_nolock enabled, EXT4_GET_BLOCKS_PRE_IO was set to split an
> existing large unwritten extent. However, EXT4_GET_BLOCKS_CONVERT was
> set when calling ext4_split_convert_extents(), which may potentially
> result in stale data issues.
> 
> Assume we have an unwritten extent, and then DIO writes the second half.
> 
>    [UUUUUUUUUUUUUUUU] on-disk extent        U: unwritten extent
>    [UUUUUUUUUUUUUUUU] extent status tree
>             |<-   ->| ----> dio write this range
> 
> First, ext4_iomap_alloc() call ext4_map_blocks() with
> EXT4_GET_BLOCKS_PRE_IO, EXT4_GET_BLOCKS_UNWRIT_EXT and
> EXT4_GET_BLOCKS_CREATE flags set. ext4_map_blocks() find this extent and
> call ext4_split_convert_extents() with EXT4_GET_BLOCKS_CONVERT and the
> above flags set.
> 
> Then, ext4_split_convert_extents() calls ext4_split_extent() with
> EXT4_EXT_MAY_ZEROOUT, EXT4_EXT_MARK_UNWRIT2 and EXT4_EXT_DATA_VALID2
> flags set, and it calls ext4_split_extent_at() to split the second half
> with EXT4_EXT_DATA_VALID2, EXT4_EXT_MARK_UNWRIT1, EXT4_EXT_MAY_ZEROOUT
> and EXT4_EXT_MARK_UNWRIT2 flags set. However, ext4_split_extent_at()
> failed to insert extent since a temporary lack -ENOSPC. It zeroes out
> the first half but convert the entire on-disk extent to written since
> the EXT4_EXT_DATA_VALID2 flag set, but left the second half as unwritten
> in the extent status tree.
> 
>    [0000000000SSSSSS]  data                S: stale data, 0: zeroed
>    [WWWWWWWWWWWWWWWW]  on-disk extent      W: written extent
>    [WWWWWWWWWWUUUUUU]  extent status tree
> 
> Finally, if the DIO failed to write data to the disk, the stale data in
> the second half will be exposed once the cached extent entry is gone.
> 
> Fix this issue by not passing EXT4_GET_BLOCKS_CONVERT when splitting
> an unwritten extent before submitting I/O, and make
> ext4_split_convert_extents() to zero out the entire extent range
> to zero for this case, and also mark the extent in the extent status
> tree for consistency.

Hi Yi,

Your analysis makes sense to me and I agree that passing CONVERT flag
there might not have been correct since we are not neccessarily
converting the extent to initialized.

Other than that, feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>

Regards,
ojaswin

> 
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> ---
>  fs/ext4/extents.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index cafe66cb562f..2db84f6b0056 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3733,11 +3733,15 @@ static struct ext4_ext_path *ext4_split_convert_extents(handle_t *handle,
>  	/* Convert to unwritten */
>  	if (flags & EXT4_GET_BLOCKS_CONVERT_UNWRITTEN) {
>  		split_flag |= EXT4_EXT_DATA_ENTIRE_VALID1;
> -	/* Convert to initialized */
> -	} else if (flags & EXT4_GET_BLOCKS_CONVERT) {
> +	/* Split the existing unwritten extent */
> +	} else if (flags & (EXT4_GET_BLOCKS_UNWRIT_EXT |
> +			    EXT4_GET_BLOCKS_CONVERT)) {
>  		split_flag |= ee_block + ee_len <= eof_block ?
>  			      EXT4_EXT_MAY_ZEROOUT : 0;
> -		split_flag |= (EXT4_EXT_MARK_UNWRIT2 | EXT4_EXT_DATA_VALID2);
> +		split_flag |= EXT4_EXT_MARK_UNWRIT2;
> +		/* Convert to initialized */
> +		if (flags & EXT4_GET_BLOCKS_CONVERT)
> +			split_flag |= EXT4_EXT_DATA_VALID2;
>  	}
>  	flags |= EXT4_GET_BLOCKS_PRE_IO;
>  	return ext4_split_extent(handle, inode, path, map, split_flag, flags,
> @@ -3913,7 +3917,7 @@ ext4_ext_handle_unwritten_extents(handle_t *handle, struct inode *inode,
>  	/* get_block() before submitting IO, split the extent */
>  	if (flags & EXT4_GET_BLOCKS_PRE_IO) {
>  		path = ext4_split_convert_extents(handle, inode, map, path,
> -				flags | EXT4_GET_BLOCKS_CONVERT, allocated);
> +						  flags, allocated);
>  		if (IS_ERR(path))
>  			return path;
>  		/*
> -- 
> 2.46.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ