lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240904102244.ekzzhbwnzbtpeyjw@quack3>
Date: Wed, 4 Sep 2024 12:22:44 +0200
From: Jan Kara <jack@...e.cz>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, tytso@....edu,
	adilger.kernel@...ger.ca, jack@...e.cz, ritesh.list@...il.com,
	yi.zhang@...wei.com, chengzhihao1@...wei.com, yukuai3@...wei.com
Subject: Re: [PATCH v3 06/12] ext4: update delalloc data reserve spcae in
 ext4_es_insert_extent()

On Tue 13-08-24 20:34:46, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@...wei.com>
> 
> Now that we update data reserved space for delalloc after allocating
> new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is
> enabled, we also need to query the extents_status tree to calculate the
> exact reserved clusters. This is complicated now and it appears that
> it's better to do this job in ext4_es_insert_extent(), because
> __es_remove_extent() have already count delalloc blocks when removing
> delalloc extents and __revise_pending() return new adding pending count,
> we could update the reserved blocks easily in ext4_es_insert_extent().
> 
> We direct reduce the reserved cluster count when replacing a delalloc
> extent. However, thers are two special cases need to concern about the
> quota claiming when doing direct block allocation (e.g. from fallocate).
> 
> A),
> fallocate a range that covers a delalloc extent but start with
> non-delayed allocated blocks, e.g. a hole.
> 
>   hhhhhhh+ddddddd+ddddddd
>   ^^^^^^^^^^^^^^^^^^^^^^^  fallocate this range
> 
> Current ext4_map_blocks() can't always trim the extent since it may
> release i_data_sem before calling ext4_map_create_blocks() and raced by
> another delayed allocation. Hence the EXT4_GET_BLOCKS_DELALLOC_RESERVE
> may not set even when we are replacing a delalloc extent, without this
> flag set, the quota has already been claimed by ext4_mb_new_blocks(), so
> we should release the quota reservations instead of claim them again.
> 
> B),
> bigalloc feature is enabled, fallocate a range that contains non-delayed
> allocated blocks.
> 
>   |<         one cluster       >|
>   hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
>   ^^^^^^^  fallocate this range
> 
> This case is similar to above case, the EXT4_GET_BLOCKS_DELALLOC_RESERVE
> flag is also not set.
> 
> Hence we should release the quota reservations if we replace a delalloc
> extent but without EXT4_GET_BLOCKS_DELALLOC_RESERVE set.
> 
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>

I really like how simple this ended being! Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
>  fs/ext4/extents.c        | 37 -------------------------------------
>  fs/ext4/extents_status.c | 25 ++++++++++++++++++++++++-
>  fs/ext4/indirect.c       |  7 -------
>  3 files changed, 24 insertions(+), 45 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 671dacd7c873..db8f9d79477c 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -4357,43 +4357,6 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>  		goto out;
>  	}
>  
> -	/*
> -	 * Reduce the reserved cluster count to reflect successful deferred
> -	 * allocation of delayed allocated clusters or direct allocation of
> -	 * clusters discovered to be delayed allocated.  Once allocated, a
> -	 * cluster is not included in the reserved count.
> -	 */
> -	if (test_opt(inode->i_sb, DELALLOC) && allocated_clusters) {
> -		if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE) {
> -			/*
> -			 * When allocating delayed allocated clusters, simply
> -			 * reduce the reserved cluster count and claim quota
> -			 */
> -			ext4_da_update_reserve_space(inode, allocated_clusters,
> -							1);
> -		} else {
> -			ext4_lblk_t lblk, len;
> -			unsigned int n;
> -
> -			/*
> -			 * When allocating non-delayed allocated clusters
> -			 * (from fallocate, filemap, DIO, or clusters
> -			 * allocated when delalloc has been disabled by
> -			 * ext4_nonda_switch), reduce the reserved cluster
> -			 * count by the number of allocated clusters that
> -			 * have previously been delayed allocated.  Quota
> -			 * has been claimed by ext4_mb_new_blocks() above,
> -			 * so release the quota reservations made for any
> -			 * previously delayed allocated clusters.
> -			 */
> -			lblk = EXT4_LBLK_CMASK(sbi, map->m_lblk);
> -			len = allocated_clusters << sbi->s_cluster_bits;
> -			n = ext4_es_delayed_clu(inode, lblk, len);
> -			if (n > 0)
> -				ext4_da_update_reserve_space(inode, (int) n, 0);
> -		}
> -	}
> -
>  	/*
>  	 * Cache the extent and update transaction to commit on fdatasync only
>  	 * when it is _not_ an unwritten extent.
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 0580bc4bc762..41adf0d69959 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -853,6 +853,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  	struct extent_status newes;
>  	ext4_lblk_t end = lblk + len - 1;
>  	int err1 = 0, err2 = 0, err3 = 0;
> +	int resv_used = 0, pending = 0;
>  	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
>  	struct extent_status *es1 = NULL;
>  	struct extent_status *es2 = NULL;
> @@ -891,7 +892,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  		pr = __alloc_pending(true);
>  	write_lock(&EXT4_I(inode)->i_es_lock);
>  
> -	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
> +	err1 = __es_remove_extent(inode, lblk, end, &resv_used, es1);
>  	if (err1 != 0)
>  		goto error;
>  	/* Free preallocated extent if it didn't get used. */
> @@ -921,9 +922,31 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>  			__free_pending(pr);
>  			pr = NULL;
>  		}
> +		pending = err3;
>  	}
>  error:
>  	write_unlock(&EXT4_I(inode)->i_es_lock);
> +	/*
> +	 * Reduce the reserved cluster count to reflect successful deferred
> +	 * allocation of delayed allocated clusters or direct allocation of
> +	 * clusters discovered to be delayed allocated.  Once allocated, a
> +	 * cluster is not included in the reserved count.
> +	 *
> +	 * When direct allocating (from fallocate, filemap, DIO, or clusters
> +	 * allocated when delalloc has been disabled by ext4_nonda_switch())
> +	 * an extent either 1) contains delayed blocks but start with
> +	 * non-delayed allocated blocks (e.g. hole) or 2) contains non-delayed
> +	 * allocated blocks which belong to delayed allocated clusters when
> +	 * bigalloc feature is enabled, quota has already been claimed by
> +	 * ext4_mb_new_blocks(), so release the quota reservations made for
> +	 * any previously delayed allocated clusters instead of claim them
> +	 * again.
> +	 */
> +	resv_used += pending;
> +	if (resv_used)
> +		ext4_da_update_reserve_space(inode, resv_used,
> +				flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE);
> +
>  	if (err1 || err2 || err3 < 0)
>  		goto retry;
>  
> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
> index d8ca7f64f952..7404f0935c90 100644
> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -652,13 +652,6 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>  	ext4_update_inode_fsync_trans(handle, inode, 1);
>  	count = ar.len;
>  
> -	/*
> -	 * Update reserved blocks/metadata blocks after successful block
> -	 * allocation which had been deferred till now.
> -	 */
> -	if (flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE)
> -		ext4_da_update_reserve_space(inode, count, 1);
> -
>  got_it:
>  	map->m_flags |= EXT4_MAP_MAPPED;
>  	map->m_pblk = le32_to_cpu(chain[depth-1].key);
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ