linux-kernel - Re: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b9b93ad2-2253-6850-da38-afc42370303e@huaweicloud.com>
Date: Fri, 10 May 2024 11:39:48 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Luis Henriques <luis.henriques@...ux.dev>, Theodore Ts'o <tytso@....edu>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org, adilger.kernel@...ger.ca,
 jack@...e.cz, ritesh.list@...il.com, hch@...radead.org, djwong@...nel.org,
 willy@...radead.org, zokeefe@...gle.com, yi.zhang@...wei.com,
 chengzhihao1@...wei.com, yukuai3@...wei.com, wangkefeng.wang@...wei.com
Subject: Re: [PATCH v3 03/26] ext4: correct the hole length returned by
 ext4_map_blocks()

On 2024/5/10 1:23, Luis Henriques wrote:
> On Thu 09 May 2024 12:39:53 PM -04, Theodore Ts'o wrote;
> 
>> On Thu, May 09, 2024 at 04:16:34PM +0100, Luis Henriques wrote:
>>>
>>> It's looks like it's easy to trigger an infinite loop here using fstest
>>> generic/039.  If I understand it correctly (which doesn't happen as often
>>> as I'd like), this is due to an integer overflow in the 'if' condition,
>>> and should be fixed with the patch below.
>>
>> Thanks for the report.  However, I can't reproduce the failure, and
>> looking at generic/039, I don't see how it could be relevant to the
>> code path in question.  Generic/039 creates a test symlink with two
>> hard links in the same directory, syncs the file system, and then
>> removes one of the hard links, and then drops access to the block
>> device using dmflakey.  So I don't see how the extent code would be
>> involved at all.  Are you sure that you have the correct test listed?
> 
> Yep, I just retested and it's definitely generic/039.  I'm using a simple
> test environment, with virtme-ng.
> 
>> Looking at the code in question in fs/ext4/extents.c:
>>
>> again:
>> 	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
>> 				  hole_start + len - 1, &es);
>> 	if (!es.es_len)
>> 		goto insert_hole;
>>
>>   	 * There's a delalloc extent in the hole, handle it if the delalloc
>>   	 * extent is in front of, behind and straddle the queried range.
>>   	 */
>>  -	if (lblk >= es.es_lblk + es.es_len) {
>>  +	if (lblk >= ((__u64) es.es_lblk) + es.es_len) {
>>   		/*
>>   		 * The delalloc extent is in front of the queried range,
>>   		 * find again from the queried start block.
>> 		len -= lblk - hole_start;
>> 		hole_start = lblk;
>> 		goto again;
>>
>> lblk and es.es_lblk are both __u32.  So the infinite loop is
>> presumably because es.es_lblk + es.es_len has overflowed.  This should
>> never happen(tm), and in fact we have a test for this case which
> 
> If I instrument the code, I can see that es.es_len is definitely set to
> EXT_MAX_BLOCKS, which will overflow.
> 

Thanks for the report. After looking at the code, I think the root
cause of this issue is the variable es was not initialized on replaying
fast commit. ext4_es_find_extent_range() will return directly when
EXT4_FC_REPLAY flag is set, and then the es.len becomes stall.

I can always reproduce this issue on generic/039 with
MKFS_OPTIONS="-O fast_commit".

This uninitialization problem originally existed in the old
ext4_ext_put_gap_in_cache(), but it didn't trigger any real problem
since we never check and use extent cache when replaying fast commit.
So I suppose the correct fix would be to unconditionally initialize
the es variable.

Thanks,
Yi.

>> *should* have gotten tripped when ext4_es_find_extent_range() calls
>> __es_tree_search() in fs/ext4/extents_status.c:
>>
>> static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
>> {
>> 	BUG_ON(es->es_lblk + es->es_len < es->es_lblk);
>> 	return es->es_lblk + es->es_len - 1;
>> }
>>
>> So the patch is harmless, and I can see how it might fix what you were
>> seeing --- but I'm a bit nervous that I can't reproduce it and the
>> commit description claims that it reproduces easily; and we should
>> have never allowed the entry to have gotten introduced into the
>> extents status tree in the first place, and if it had been introduced,
>> it should have been caught before it was returned by
>> ext4_es_find_extent_range().
>>
>> Can you give more details about the reproducer; can you double check
>> the test id, and how easily you can trigger the failure, and what is
>> the hardware you used to run the test?
> 
> So, here's few more details that may clarify, and that I should have added
> to the commit description:
> 
> When the test hangs, the test is blocked mounting the flakey device:
> 
>    mount -t ext4 -o acl,user_xattr /dev/mapper/flakey-test /mnt/scratch
> 
> which will eventually call into ext4_ext_map_blocks(), triggering the bug.
> 
> Also, some more code instrumentation shows that after the call to
> ext4_ext_find_hole(), the 'hole_start' will be set to '1' and 'len' to
> '0xfffffffe'.  This '0xfffffffe' value is a bit odd, but it comes from the
> fact that, in ext4_ext_find_hole(), the call to
> ext4_ext_next_allocated_block() will return EXT_MAX_BLOCKS and 'len' will
> thus be set to 'EXT_MAX_BLOCKS - 1'.
> 
> Does this make sense?
> 
> Cheers,
>