linux-kernel - Re: [PATCH 5/7] ext4: Refactor zeroout path and handle all cases

Message-ID: <berakgy2my7h2v5wfijozaucks2fykqhqaq6zdbaucy7cx5osq@gkzxv4snj2ug>
Date: Tue, 6 Jan 2026 16:31:23 +0100
From: Jan Kara <jack@...e.cz>
To: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
Cc: linux-ext4@...r.kernel.org, Theodore Ts'o <tytso@....edu>, 
	Ritesh Harjani <ritesh.list@...il.com>, Zhang Yi <yi.zhang@...wei.com>, Jan Kara <jack@...e.cz>, 
	libaokun1@...wei.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/7] ext4: Refactor zeroout path and handle all cases

On Sun 04-01-26 17:49:18, Ojaswin Mujoo wrote:
> Currently, zeroout is used as a fallback in case we fail to
> split/convert extents in the "traditional" modify-the-extent-tree way.
> This is essential to mitigate failures in critical paths like extent
> splitting during endio. However, the logic is very messy and not easy to
> follow. Further, the fragile use of various flags has made it prone to
> errors.
> 
> Refactor the zeroout logic by moving it up to ext4_split_extent().
> Further, zeroout correctly based on the type of conversion we want, i.e.:
> - unwritten to written: Zeroout everything around the mapped range.
> - unwritten to unwritten: Zeroout everything
> - written to unwritten: Zeroout only the mapped range.
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
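
For readers following along, the three conversion cases listed in the commit message can be sketched in plain userspace C. This is only an illustration of which block ranges would get zeroed in each case; the enum, the struct and the function name are hypothetical and do not exist in ext4, and the mapped range is assumed to lie inside the extent.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not ext4 types or flags. */
enum conv_type {
	CONV_UNWRIT_TO_WRIT,	/* unwritten -> written */
	CONV_UNWRIT_TO_UNWRIT,	/* unwritten -> unwritten */
	CONV_WRIT_TO_UNWRIT,	/* written -> unwritten */
};

struct blk_range {
	uint32_t start;	/* first logical block of the range */
	uint32_t len;	/* number of blocks in the range */
};

/*
 * Fill 'out' with the ranges to zero and return how many were filled.
 * The extent covers [ee_block, ee_block + ee_len) and the mapped range
 * [m_lblk, m_lblk + m_len) is assumed to lie inside it.
 */
static int zeroout_ranges(enum conv_type type,
			  uint32_t ee_block, uint32_t ee_len,
			  uint32_t m_lblk, uint32_t m_len,
			  struct blk_range out[2])
{
	uint32_t ee_end = ee_block + ee_len;
	uint32_t m_end = m_lblk + m_len;
	int n = 0;

	switch (type) {
	case CONV_UNWRIT_TO_WRIT:
		/* Zero everything around the mapped range. */
		if (m_lblk > ee_block)
			out[n++] = (struct blk_range){ ee_block, m_lblk - ee_block };
		if (m_end < ee_end)
			out[n++] = (struct blk_range){ m_end, ee_end - m_end };
		break;
	case CONV_UNWRIT_TO_UNWRIT:
		/* Zero the whole extent. */
		out[n++] = (struct blk_range){ ee_block, ee_len };
		break;
	case CONV_WRIT_TO_UNWRIT:
		/* Zero only the mapped range. */
		out[n++] = (struct blk_range){ m_lblk, m_len };
		break;
	}
	return n;
}
```

With an extent at blocks 100..109 and a mapped range at 103..106, the first case yields two ranges (100..102 and 107..109), the second the whole extent, and the third just 103..106.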

...

> @@ -3383,11 +3440,12 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
>  					       int split_flag, int flags,
>  					       unsigned int *allocated)
>  {
> -	ext4_lblk_t ee_block;
> +	ext4_lblk_t ee_block, orig_ee_block;
>  	struct ext4_extent *ex;
> -	unsigned int ee_len, depth;
> -	int unwritten;
> -	int split_flag1, flags1;
> +	unsigned int ee_len, orig_ee_len, depth;
> +	int unwritten, orig_unwritten;
> +	int split_flag1 = 0, flags1 = 0;
> +	int err = 0, orig_err;

Can't orig_err be used uninitialized in this function? At least it isn't
obvious to me that one of the branches setting it is always taken.
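
One way to make this obviously safe is to give the saved error a defined value on every path. The following is a minimal userspace sketch of that bookkeeping, not kernel code: classify_split(), is_retryable() and the enum are hypothetical names, and the do_right / do_left parameters merely stand in for the two conditions that guard the split calls in the patch.

```c
#include <errno.h>

/* Hypothetical helper: errors for which a zeroout fallback is attempted. */
static int is_retryable(int err)
{
	return err == -ENOSPC || err == -EDQUOT || err == -ENOMEM;
}

enum split_result { SPLIT_OK, SPLIT_TRY_ZEROOUT, SPLIT_FAIL };

/*
 * Toy model of the two split attempts. right_err / left_err are the
 * results each step would produce; do_right / do_left say whether the
 * step runs at all. *orig_err is written on every path.
 */
static enum split_result classify_split(int do_right, int right_err,
					int do_left, int left_err,
					int *orig_err)
{
	*orig_err = 0;	/* defined even if neither step runs */

	if (do_right && right_err) {
		*orig_err = right_err;
		return is_retryable(right_err) ? SPLIT_TRY_ZEROOUT : SPLIT_FAIL;
	}
	if (do_left && left_err) {
		*orig_err = left_err;
		return is_retryable(left_err) ? SPLIT_TRY_ZEROOUT : SPLIT_FAIL;
	}
	return SPLIT_OK;
}
```

The point of the sketch is only that the fallback is reachable solely with a non-zero saved error, and that the success path never reads an indeterminate value.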

> @@ -3395,23 +3453,29 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
>  	ee_len = ext4_ext_get_actual_len(ex);
>  	unwritten = ext4_ext_is_unwritten(ex);
>  
> +	orig_ee_block = ee_block;
> +	orig_ee_len = ee_len;
> +	orig_unwritten = unwritten;
> +
>  	/* Do not cache extents that are in the process of being modified. */
>  	flags |= EXT4_EX_NOCACHE;
>  
>  	if (map->m_lblk + map->m_len < ee_block + ee_len) {
> -		split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
>  		flags1 = flags | EXT4_GET_BLOCKS_SPLIT_NOMERGE;
>  		if (unwritten)
>  			split_flag1 |= EXT4_EXT_MARK_UNWRIT1 |
>  				       EXT4_EXT_MARK_UNWRIT2;
> -		if (split_flag & EXT4_EXT_DATA_VALID2)
> -			split_flag1 |= map->m_lblk > ee_block ?
> -				       EXT4_EXT_DATA_PARTIAL_VALID1 :
> -				       EXT4_EXT_DATA_ENTIRE_VALID1;
>  		path = ext4_split_extent_at(handle, inode, path,
>  				map->m_lblk + map->m_len, split_flag1, flags1);
> -		if (IS_ERR(path))
> -			return path;
> +
> +		if (IS_ERR(path)) {
> +			orig_err = PTR_ERR(path);
> +			if (orig_err != -ENOSPC && orig_err != -EDQUOT &&
> +			    orig_err != -ENOMEM)
> +				return path;
> +
> +			goto try_zeroout;
> +		}
>  		/*
>  		 * Update path is required because previous ext4_split_extent_at
>  		 * may result in split of original leaf or extent zeroout.
> @@ -3427,22 +3491,68 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
>  			ext4_free_ext_path(path);
>  			return ERR_PTR(-EFSCORRUPTED);
>  		}
> -		unwritten = ext4_ext_is_unwritten(ex);
>  	}
>  
>  	if (map->m_lblk >= ee_block) {
> -		split_flag1 = split_flag & EXT4_EXT_DATA_VALID2;
> +		split_flag1 = 0;
>  		if (unwritten) {
>  			split_flag1 |= EXT4_EXT_MARK_UNWRIT1;
> -			split_flag1 |= split_flag & (EXT4_EXT_MAY_ZEROOUT |
> -						     EXT4_EXT_MARK_UNWRIT2);
> +			split_flag1 |= split_flag & EXT4_EXT_MARK_UNWRIT2;
>  		}
> -		path = ext4_split_extent_at(handle, inode, path,
> -				map->m_lblk, split_flag1, flags);
> +		path = ext4_split_extent_at(handle, inode, path, map->m_lblk,
> +					    split_flag1, flags);
> +
> +		if (IS_ERR(path)) {
> +			orig_err = PTR_ERR(path);
> +			if (orig_err != -ENOSPC && orig_err != -EDQUOT &&
> +			    orig_err != -ENOMEM)
> +				return path;
> +
> +			goto try_zeroout;
> +		}
> +	}
> +
> +	if (!err)

Nothing touches 'err' in this function...

> +		goto out;
> +
> +try_zeroout:
> +	/*
> +	 * There was an error in splitting the extent, just zeroout and convert
> +	 * to initialize as a last resort
> +	 */
> +	if (split_flag & EXT4_EXT_MAY_ZEROOUT) {
> +		path = ext4_find_extent(inode, map->m_lblk, NULL, flags);
>  		if (IS_ERR(path))
>  			return path;
> +
> +		depth = ext_depth(inode);
> +		ex = path[depth].p_ext;
> +		ee_block = le32_to_cpu(ex->ee_block);
> +		ee_len = ext4_ext_get_actual_len(ex);
> +		unwritten = ext4_ext_is_unwritten(ex);
> +
> +		/*
> +		 * The extent to zeroout should have been unchanged
> +		 * but its not, just return error to caller
> +		 */
> +		if (WARN_ON(ee_block != orig_ee_block ||
> +			    ee_len != orig_ee_len ||
> +			    unwritten != orig_unwritten))
> +			return ERR_PTR(orig_err);
> +
> +		/*
> +		 * Something went wrong in zeroout, just return the
> +		 * original error
> +		 */
> +		if (ext4_split_extent_zeroout(handle, inode, path, map, flags))
> +			return ERR_PTR(orig_err);
>  	}

Also, nothing seems to clear orig_err in case the zeroout above succeeded,
so we would still return ERR_PTR(orig_err) below. What am I missing?
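
To illustrate the behavior I would have expected from the fallback, here is a small userspace model of the try_zeroout tail. It assumes that a successful zeroout should be reported as success, which may or may not be what the patch intends; zeroout_fallback() and all of its parameters are hypothetical stand-ins, not ext4 interfaces.

```c
#include <errno.h>

/*
 * Toy model of the try_zeroout tail. All inputs are hypothetical:
 *   orig_err       - negative errno saved from the failed split
 *   may_zeroout    - non-zero if the caller allows falling back to zeroout
 *   extent_changed - non-zero if the re-looked-up extent no longer matches
 *                    the block, length and unwritten state saved earlier
 *   zeroout_err    - result of the zeroout attempt, 0 on success
 * Returns 0 if the fallback succeeded, otherwise the original error.
 */
static int zeroout_fallback(int orig_err, int may_zeroout,
			    int extent_changed, int zeroout_err)
{
	if (!may_zeroout)
		return orig_err;	/* fallback not permitted */
	if (extent_changed)
		return orig_err;	/* extent is not what we saved */
	if (zeroout_err)
		return orig_err;	/* zeroout failed, keep first error */

	return 0;	/* zeroout worked: report success, not orig_err */
}
```

In the patch as posted, the equivalent of the last return is missing, so control would fall through to the unconditional error return.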

								Honza

>  
> +	/* There's an error and we can't zeroout, just return the err */
> +	return ERR_PTR(orig_err);
> +
> +out:
> +
>  	if (allocated) {
>  		if (map->m_lblk + map->m_len > ee_block + ee_len)
>  			*allocated = ee_len - (map->m_lblk - ee_block);
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR