linux-ext4 - Re: Ext4 iomap warning during btrfs/136 (yes, it's from btrfs test cases)

Message-ID: <869c9ca6-2799-4ae0-8490-f383d7e0564b@suse.com>
Date: Tue, 12 Aug 2025 07:44:09 +0930
From: Qu Wenruo <wqu@...e.com>
To: "Darrick J. Wong" <djwong@...nel.org>, Qu Wenruo <quwenruo.btrfs@....com>
Cc: Zhang Yi <yi.zhang@...weicloud.com>, Theodore Ts'o <tytso@....edu>,
 linux-ext4 <linux-ext4@...r.kernel.org>,
 linux-btrfs <linux-btrfs@...r.kernel.org>,
 "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>
Subject: Re: Ext4 iomap warning during btrfs/136 (yes, it's from btrfs test
 cases)



On 2025/8/12 01:19, Darrick J. Wong wrote:
> On Sun, Aug 10, 2025 at 07:36:48AM +0930, Qu Wenruo wrote:
>>
>>
>> On 2025/8/9 18:39, Zhang Yi wrote:
>>> On 2025/8/9 6:11, Qu Wenruo wrote:
>>>> On 2025/8/8 21:46, Theodore Ts'o wrote:
>>>>> On Fri, Aug 08, 2025 at 06:20:56PM +0930, Qu Wenruo wrote:
>>>>>>
>>>>>> On 2025/8/8 17:22, Qu Wenruo wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> [BACKGROUND]
>>>>>>> Recently I've been testing btrfs with a 16KiB block size.
>>>>>>>
>>>>>>> Currently btrfs is artificially limiting subpage block size to 4K.
>>>>>>> But there is a simple patch to change it to support all block sizes <=
>>>>>>> page size in my branch:
>>>>>>>
>>>>>>> https://github.com/adam900710/linux/tree/larger_bs_support
>>>>>>>
>>>>>>> [IOMAP WARNING]
>>>>>>> And I'm running into a very weird kernel warning at btrfs/136, with 16K
>>>>>>> block size and 64K page size.
>>>>>>>
>>>>>>> The catch is that the warning happens on ext3 (using the ext4 module)
>>>>>>> with 16K block size, and no btrfs is involved yet.
>>>>>
>>>>>
>>>>> Thanks for the bug report!  This looks like it's an issue with using
>>>>> indirect block-mapped files with a 16k block size.  I tried your
>>>>> reproducer using a 1k block size on an x86_64 system, which is how I
>>>>> test problems caused by block size < page size.  It didn't
>>>>> reproduce there, so it looks like it really needs a 16k block size.
>>>>>
>>>>> Can you say something about what system you were running your tests
>>>>> on --- was it an arm64 system, or a powerpc 64 system (the two most
>>>>> common systems with page size > 4k)?  (I assume you're not trying to
>>>>> do this on an Itanic.  :-)   And was the page size 16k or 64k?
>>>>
>>>> The architecture is aarch64, the host board is Rock5B (cheap and fast enough), the test machine is a VM on that board, with ovmf as the UEFI firmware.
>>>>
>>>> The kernel is configured to use 64K page size, the *ext3* system is using 16K block size.
>>>>
>>>> So far I have tried the following block sizes with 64K page size and ext3, with these results:
>>>>
>>>> - 2K block size
>>>> - 4K block size
>>>>     All fine
>>>>
>>>> - 8K block size
>>>> - 16K block size
>>>>     Both hit the same kernel warning and a never-ending fsstress
>>>>
>>>> - 32K block size
>>>> - 64K block size
>>>>     All fine
>>>>
>>>> I am as surprised as you that not all subpage block sizes have problems; only 2 of the less common combinations fail.
>>>>
>>>> And the most common ones (4K, page size) are all fine.
>>>>
>>>> Finally, when using ext4 instead of ext3, all combinations above are fine again.
>>>>
>>>> So I ran out of ideas why only 2 block sizes fail here...
>>>>
>>>
>>> This issue is caused by an overflow in the calculation of the hole's
>>> length at the fourth-level depth for non-extent inodes. For a file system
>>> with a 4KB block size, the calculation will not overflow. For a 64KB
>>> block size, the queried position will not reach the fourth level, so this
>>> issue only occurs on filesystems with an 8KB or 16KB block size.
>>>
>>> Hi, Wenruo, could you try the following fix?
>>>
>>> diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
>>> index 7de327fa7b1c..d45124318200 100644
>>> --- a/fs/ext4/indirect.c
>>> +++ b/fs/ext4/indirect.c
>>> @@ -539,7 +539,7 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>>>    	int indirect_blks;
>>>    	int blocks_to_boundary = 0;
>>>    	int depth;
>>> -	int count = 0;
>>> +	u64 count = 0;
>>>    	ext4_fsblk_t first_block = 0;
>>>
>>>    	trace_ext4_ind_map_blocks_enter(inode, map->m_lblk, map->m_len, flags);
>>> @@ -588,7 +588,7 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
>>>    		count++;
>>>    		/* Fill in size of a hole we found */
>>>    		map->m_pblk = 0;
>>> -		map->m_len = min_t(unsigned int, map->m_len, count);
>>> +		map->m_len = umin(map->m_len, count);
>>>    		goto cleanup;
>>>    	}
>>
>> It indeed solves the problem.
>>
>> Tested-by: Qu Wenruo <wqu@...e.com>
> 
> Can we get the relevant chunks of this test turned into a tests/ext4/
> fstest so that the ext4 developers have a regression test that doesn't
> require setting up btrfs, please?

Sure, although I can send out an ext4-specific test case for it, I'm 
definitely not the best one to explain why the problem happens.

Thus I believe Zhang Yi would be the best one to send the test case.



Another thing is, any ext3 run with 16K block size (that's if the system 
supports it) should trigger it with the existing test cases.

The biggest challenge is to get a system supporting 16k bs (aka page 
size >= 16K), so there is a high chance that for most people the new test 
case will just be NOTRUN.

Thanks,
Qu

> 
> --D
> 
>> Thanks,
>> Qu
>>
>>> Thanks,
>>> Yi.
>>>
>>
>>