linux-kernel - Re: [PATCH v2 3/6] ext4: restart handle if credits are insufficient during allocating blocks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <00d60446-f380-4480-b643-2b63669ebccc@huaweicloud.com>
Date: Fri, 20 Jun 2025 13:00:32 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
 ojaswin@...ux.ibm.com, yi.zhang@...wei.com, libaokun1@...wei.com,
 yukuai3@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v2 3/6] ext4: restart handle if credits are insufficient
 during allocating blocks

On 2025/6/20 0:33, Jan Kara wrote:
> On Wed 11-06-25 19:16:22, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@...wei.com>
>>
>> After large folios are supported on ext4, writing back a sufficiently
>> large and discontinuous folio may consume a significant number of
>> journal credits, placing considerable strain on the journal. For
>> example, in a 20GB filesystem with 1K block size and 1MB journal size,
>> writing back a 2MB folio could require thousands of credits in the
>> worst-case scenario (when each block is discontinuous and distributed
>> across different block groups), potentially exceeding the journal size.
>> This issue can also occur in ext4_write_begin() and ext4_page_mkwrite()
>> when delalloc is not enabled.
>>
>> Fix this by ensuring that there are sufficient journal credits before
>> allocating an extent in mpage_map_one_extent() and _ext4_get_block(). If
>> there are not enough credits, return -EAGAIN, exit the current mapping
>> loop, restart a new handle and a new transaction, and allocating blocks
>> on this folio again in the next iteration.
>>
>> Suggested-by: Jan Kara <jack@...e.cz>
>> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> 
> ...
> 
>>  static int _ext4_get_block(struct inode *inode, sector_t iblock,
>>  			   struct buffer_head *bh, int flags)
>>  {
>>  	struct ext4_map_blocks map;
>> +	handle_t *handle = ext4_journal_current_handle();
>>  	int ret = 0;
>>  
>>  	if (ext4_has_inline_data(inode))
>>  		return -ERANGE;
>>  
>> +	/* Make sure transaction has enough credits for this extent */
>> +	if (flags & EXT4_GET_BLOCKS_CREATE) {
>> +		ret = ext4_journal_ensure_extent_credits(handle, inode);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
>>  	map.m_lblk = iblock;
>>  	map.m_len = bh->b_size >> inode->i_blkbits;
>>  
>> -	ret = ext4_map_blocks(ext4_journal_current_handle(), inode, &map,
>> -			      flags);
>> +	ret = ext4_map_blocks(handle, inode, &map, flags);
> 
> Good spotting with ext4_page_mkwrite() and ext4_write_begin() also needing
> this treatment! But rather then hiding the transaction extension in
> _ext4_get_block() I'd do this in ext4_block_write_begin() where it is much
> more obvious (and also it is much more obvious who needs to be prepared for
> handling EAGAIN error). Otherwise the patch looks good!
> 

Yes, I completely agree with you. However, unfortunately, do this in
ext4_block_write_begin() only works for ext4_write_begin().
ext4_page_mkwrite() does not call ext4_block_write_begin() to allocate
blocks, it call the vfs helper __block_write_begin_int() instead.

vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
{
	...
	if (!ext4_should_journal_data(inode)) {
		err = block_page_mkwrite(vma, vmf, get_block);
	...
}


So...

Thanks,
Yi.