Message-ID: <0d7b0731-88c3-4114-a401-e6aa8a085c5f@huaweicloud.com>
Date: Mon, 30 Jun 2025 21:58:52 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Sasha Levin <sashal@...nel.org>
Cc: linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, tytso@....edu
Subject: Re: [PATCH] ext4: fix JBD2 credit overflow with large folios
On 2025/6/30 21:13, Sasha Levin wrote:
> When large folios are enabled, the blocks-per-folio calculation in
> ext4_da_writepages_trans_blocks() can overflow the journal transaction
> limits, causing the writeback path to fail with errors like:
>
> JBD2: kworker/u8:0 wants too many credits credits:416 rsv_credits:21 max:334
>
> This occurs with small block sizes (1KB) and large folios (32MB), where
> the calculation results in 32768 blocks per folio. The transaction credit
> calculation then requests more credits than the journal can handle, leading
> to the following warning and writeback failure:
>
> WARNING: CPU: 1 PID: 43 at fs/jbd2/transaction.c:334 start_this_handle+0x4c0/0x4e0
> EXT4-fs (loop0): ext4_do_writepages: jbd2_start: 9223372036854775807 pages, ino 14; err -28
>
> Call trace leading to the issue:
>   ext4_do_writepages()
>     ext4_da_writepages_trans_blocks()
>       bpp = ext4_journal_blocks_per_folio() // Returns 32768 for 32MB folio with 1KB blocks
>       ext4_meta_trans_blocks(inode, MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp)
>       // With bpp=32768, lblocks=34815, pextents=32768
>       // Returns credits=415, but with overhead becomes 416 > max 334
>     ext4_journal_start_with_reserve()
>       jbd2_journal_start_reserved()
>         start_this_handle()
>         // Fails with warning when credits:416 > max:334
>
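(As a quick sanity check, the numbers in the trace above can be reproduced with a
trivial standalone program; the names below mirror the kernel helpers, but this is
only an illustration, not kernel code.)

#include <stdio.h>

#define MAX_WRITEPAGES_EXTENT_LEN  2048

int main(void)
{
        unsigned long folio_size = 32UL << 20;          /* 32MB large folio */
        unsigned long block_size = 1024UL;              /* 1KB ext4 block size */

        /* what ext4_journal_blocks_per_folio() returns for this setup */
        unsigned long bpp = folio_size / block_size;    /* 32768 */

        /* arguments passed to ext4_meta_trans_blocks() in the trace */
        unsigned long lblocks  = MAX_WRITEPAGES_EXTENT_LEN + bpp - 1;  /* 34815 */
        unsigned long pextents = bpp;                                  /* 32768 */

        printf("bpp=%lu lblocks=%lu pextents=%lu\n", bpp, lblocks, pextents);
        /* per the report, this ends up as a 416-credit request vs. a max of 334 */
        return 0;
}
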
> The issue was introduced by commit d6bf294773a47 ("ext4/jbd2: convert
> jbd2_journal_blocks_per_page() to support large folio"), which added
> support for large folios but didn't account for the journal credit limits.
>
> Fix this by capping the blocks-per-folio value at 8192 in the writeback
> path. This is the value we'd get with 32MB folios and 4KB blocks, or 8MB
> folios with 1KB blocks, which is reasonable and safe for typical journal
> configurations.
>
> Fixes: d6bf294773a4 ("ext4/jbd2: convert jbd2_journal_blocks_per_page() to support large folio")
> Cc: stable@...r.kernel.org
> Signed-off-by: Sasha Levin <sashal@...nel.org>
Hi, Sasha!
Thank you for the fix. However, simply limiting the credits is not enough,
as this may result in a scenario where there are not enough credits
available to map a large, non-contiguous folio. I've been working on this
issue[1] and I'll post v3 tomorrow if my tests look fine.
[1] https://lore.kernel.org/linux-ext4/20250611111625.1668035-1-yi.zhang@huaweicloud.com/
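
To make the concern concrete, here is a rough standalone sketch using the numbers
from the report and the proposed cap (an illustration only, not kernel code): even
with bpp capped, a badly fragmented large folio still contains far more blocks than
the value the transaction credits are sized for.

#include <stdio.h>

#define MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK  8192

int main(void)
{
        /* 32MB folio with 1KB blocks, as in the report */
        unsigned long real_bpp   = (32UL << 20) / 1024; /* 32768 blocks in the folio */
        unsigned long capped_bpp = real_bpp;

        if (capped_bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
                capped_bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;        /* 8192 */

        /*
         * The transaction credits are now sized for at most capped_bpp
         * blocks, but in the worst case each of the folio's real_bpp
         * blocks sits in its own extent, so writeback can still need to
         * map far more than the reservation covers before the folio is
         * fully submitted.
         */
        printf("credits sized for %lu blocks, folio may need up to %lu\n",
               capped_bpp, real_bpp);
        return 0;
}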
Thanks,
Yi.
> ---
> fs/ext4/inode.c | 34 ++++++++++++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index be9a4cba35fd5..860e59a176c97 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2070,6 +2070,14 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
>   */
>  #define MAX_WRITEPAGES_EXTENT_LEN 2048
> 
> +/*
> + * Maximum blocks per folio to avoid JBD2 credit overflow.
> + * This is the value we'd get with 32MB folios and 4KB blocks,
> + * or 8MB folios with 1KB blocks, which is reasonable and safe
> + * for typical journal configurations.
> + */
> +#define MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK 8192
> +
>  /*
>   * mpage_add_bh_to_extent - try to add bh to extent of blocks to map
>   *
> @@ -2481,6 +2489,18 @@ static int ext4_da_writepages_trans_blocks(struct inode *inode)
>  {
>          int bpp = ext4_journal_blocks_per_folio(inode);
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. For example, with 32MB folios
> +         * (order 11) and 1KB blocks, we get 32768 blocks per folio. This
> +         * leads to credit requests that overflow the journal's transaction
> +         * limit.
> +         *
> +         * Limit the value to avoid excessive credit requests.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          return ext4_meta_trans_blocks(inode,
>                                  MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp);
>  }
> @@ -2559,6 +2579,13 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd)
>          handle_t *handle = NULL;
>          int bpp = ext4_journal_blocks_per_folio(mpd->inode);
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. Cap it to avoid credit overflow.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          if (mpd->wbc->sync_mode == WB_SYNC_ALL || mpd->wbc->tagged_writepages)
>                  tag = PAGECACHE_TAG_TOWRITE;
>          else
> @@ -6179,6 +6206,13 @@ int ext4_writepage_trans_blocks(struct inode *inode)
>          int bpp = ext4_journal_blocks_per_folio(inode);
>          int ret;
> 
> +        /*
> +         * With large folios, blocks per folio can get excessively large,
> +         * especially with small block sizes. Cap it to avoid credit overflow.
> +         */
> +        if (bpp > MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK)
> +                bpp = MAX_BLOCKS_PER_FOLIO_FOR_WRITEBACK;
> +
>          ret = ext4_meta_trans_blocks(inode, bpp, bpp);
> 
>          /* Account for data blocks for journalled mode */