Message-ID: <0d7b0731-88c3-4114-a401-e6aa8a085c5f@huaweicloud.com>
Date: Mon, 30 Jun 2025 21:58:52 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Sasha Levin <sashal@...nel.org>
Cc: linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, tytso@....edu
Subject: Re: [PATCH] ext4: fix JBD2 credit overflow with large folios
On 2025/6/30 21:13, Sasha Levin wrote:
> When large folios are enabled, the blocks-per-folio calculation in
> ext4_da_writepages_trans_blocks() can overflow the journal transaction
> limits, causing the writeback path to fail with errors like:
>
> JBD2: kworker/u8:0 wants too many credits credits:416 rsv_credits:21 max:334
>
> This occurs with small block sizes (1KB) and large folios (32MB), where
> the calculation results in 32768 blocks per folio. The transaction credit
> calculation then requests more credits than the journal can handle, leading
> to the following warning and writeback failure:
>
> WARNING: CPU: 1 PID: 43 at fs/jbd2/transaction.c:334 start_this_handle+0x4c0/0x4e0
> EXT4-fs (loop0): ext4_do_writepages: jbd2_start: 9223372036854775807 pages, ino 14; err -28
>
> Call trace leading to the issue:
>   ext4_do_writepages()
>     ext4_da_writepages_trans_blocks()
>       bpp = ext4_journal_blocks_per_folio() // Returns 32768 for 32MB folio with 1KB blocks
>       ext4_meta_trans_blocks(inode, MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp)
>       // With bpp=32768, lblocks=34815, pextents=32768
>       // Returns credits=415, but with overhead becomes 416 > max 334
>     ext4_journal_start_with_reserve()
>       jbd2_journal_start_reserved()
>         start_this_handle()
>         // Fails with warning when credits:416 > max:334
>
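(As a quick sanity check, the numbers in the trace above can be reproduced with a
trivial standalone program; the names below mirror the kernel helpers, but this is
only an illustration, not kernel code.)

#include <stdio.h>

#define MAX_WRITEPAGES_EXTENT_LEN  2048

int main(void)
{
        unsigned long folio_size = 32UL << 20;          /* 32MB large folio */
        unsigned long block_size = 1024UL;              /* 1KB ext4 block size */

        /* what ext4_journal_blocks_per_folio() returns for this setup */
        unsigned long bpp = folio_size / block_size;    /* 32768 */

        /* arguments passed to ext4_meta_trans_blocks() in the trace */
        unsigned long lblocks  = MAX_WRITEPAGES_EXTENT_LEN + bpp - 1;  /* 34815 */
        unsigned long pextents = bpp;                                  /* 32768 */

        printf("bpp=%lu lblocks=%lu pextents=%lu\n", bpp, lblocks, pextents);
        /* per the report, this ends up as a 416-credit request vs. a max of 334 */
        return 0;
}
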
> The issue was introduced by commit d6bf294773a47 ("ext4/jbd2: convert
> jbd2_journal_blocks_per_page() to support large folio"), which added
> support for large folios but didn't account for the journal credit limits.
>
> Fix this by capping the blocks-per-folio value at 8192 in the writeback
> path. This is the value we'd get with 32MB folios and 4KB blocks, or 8MB
> folios with 1KB blocks, which is reasonable and safe for typical journal
> configurations.
>
> Fixes: d6bf294773a4 ("ext4/jbd2: convert jbd2_journal_blocks_per_page() to support large folio")
> Cc: stable@...r.kernel.org
> Signed-off-by: Sasha Levin <sashal@...nel.org>
Hi, Sasha!
Thank you for the fix. However, simply limiting the credits is not enough,
as this may result in a scenario where there are not enough credits
available to map a large, non-contiguous folio. I've been working on this
issue[1] and I'll post v3 tomorrow if my tests look fine.
[1] https://lore.kernel.org/linux-ext4/20250611111625.1668035-1-yi.zhang@huaweicloud.com/
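
To make the concern concrete, here is a rough standalone sketch using the numbers
from the report and the proposed cap (an illustration only, not kernel code): even
with bpp capped, a badly fragmented large folio still contains far more blocks than
the value the transaction credits are sized for.

#include <stdio.h>

#define MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK  8192

int main(void)
{
        /* 32MB folio with 1KB blocks, as in the report */
        unsigned long real_bpp   = (32UL << 20) / 1024; /* 32768 blocks in the folio */
        unsigned long capped_bpp = real_bpp;

        if (capped_bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
                capped_bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;        /* 8192 */

        /*
         * The transaction credits are now sized for at most capped_bpp
         * blocks, but in the worst case each of the folio's real_bpp
         * blocks sits in its own extent, so writeback can still need to
         * map far more than the reservation covers before the folio is
         * fully submitted.
         */
        printf("credits sized for %lu blocks, folio may need up to %lu\n",
               capped_bpp, real_bpp);
        return 0;
}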
Thanks,
Yi.
> ---
> fs/ext4/inode.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index be9a4cba35fd5..860e59a176c97 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2070,6 +2070,14 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
>   */
>  #define MAX_WRITEPAGES_EXTENT_LEN 2048
> 
> +/*
> + * Maximum blocks per folio to avoid JBD2 credit overflow.
> + * This is the value we'd get with 32MB folios and 4KB blocks,
> + * or 8MB folios with 1KB blocks, which is reasonable and safe
> + * for typical journal configurations.
> + */
> +#define MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK 8192
> +
>  /*
>   * mpage_add_bh_to_extent - try to add bh to extent of blocks to map
>   *
> @@ -2481,6 +2489,18 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
>  {
>          int bpp = ext4_journal_blocks_per_folio(inode);
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. For example, with 32MB folios
> +         * (order 11) and 1KB blocks, we get 32768 blocks per folio. This
> +         * leads to credit requests that overflow the journal's transaction
> +         * limit.
> +         *
> +         * Limit the value to avoid excessive credit requests.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          return ext4_meta_trans_blocks(inode,
>                                  MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp);
>  }
> @@ -2559,6 +2579,13 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>          handle_t *handle = NULL;
>          int bpp = ext4_journal_blocks_per_folio(mpd->inode);
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. Cap it to avoid credit overflow.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          if (mpd->wbc->sync_mode == WB_SYNC_ALL || mpd->wbc->tagged_writepages)
>                  tag = PAGECACHE_TAG_TOWRITE;
>          else
> @@ -6179,6 +6206,13 @@ int ext4_writepage_trans_blocks(struct inode *inode)
>          int bpp = ext4_journal_blocks_per_folio(inode);
>          int ret;
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. Cap it to avoid credit overflow.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          ret = ext4_meta_trans_blocks(inode, bpp, bpp);
> 
>          /* Account for data blocks for journalled mode */