lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <7241eee3-8d6b-4034-a124-c04e9d06614a@suse.com>
Date: Mon, 14 Apr 2025 11:32:29 +0800
From: Heming Zhao <heming.zhao@...e.com>
To: Joseph Qi <joseph.qi@...ux.alibaba.com>
Cc: ocfs2-devel@...ts.linux.dev, linux-kernel@...r.kernel.org,
 akpm <akpm@...ux-foundation.org>, gautham.ananthakrishna@...cle.com
Subject: Re: [PATCH 1/1] ocfs2: fix the issue with discontiguous allocation in
 the global_bitmap

Hi Joseph,

On 4/13/25 09:06, Joseph Qi wrote:
> 
> 
> On 2025/4/8 23:23, Heming Zhao wrote:
>> The commit 4eb7b93e0310 ("ocfs2: improve write IO performance when
>> fragmentation is high") introduced another regression.
>>
>> The following ocfs2-test case can trigger this issue:
>>> ${RESV_UNWRITTEN_BIN} -f ${WORK_PLACE}/large_testfile -s 0 -l \
>>> $((${FILE_MAJOR_SIZE_M}*1024*1024))
>>
>> In my env, test disk size (by "fdisk -l <dev>"):
>>> 53687091200 bytes, 104857600 sectors.
>>
>> Above command is:
>>> /usr/local/ocfs2-test/bin/resv_unwritten -f \
>>> /mnt/ocfs2/ocfs2-activate-discontig-bg-dir/large_testfile -s 0 -l \
>>> 53187969024
>>
>> Error log:
>>> [*] Reserve 50724M space for a LARGE file, reserve 200M space for future test.
>>> ioctl error 28: "No space left on device"
>>> resv allocation failed Unknown error -1
>>> reserve unwritten region from 0 to 53187969024.
>>
>> Call flow:
>> __ocfs2_change_file_space //by ioctl OCFS2_IOC_RESVSP64
>>   ocfs2_allocate_unwritten_extents //start:0 len:53187969024
>>    while()
>>     + ocfs2_get_clusters //cpos:0, alloc_size:1623168 (cluster number)
>>     + ocfs2_extend_allocation
>>       + ocfs2_lock_allocators
>>       |  + choose OCFS2_AC_USE_MAIN & ocfs2_cluster_group_search
>>       |
>>       + ocfs2_add_inode_data
>>          ocfs2_add_clusters_in_btree
>>           __ocfs2_claim_clusters
>>            ocfs2_claim_suballoc_bits
>>            + During the allocation of the final part of the large file
>> 	    (around 47GB), no chain had the required contiguous
>>              bits_wanted. Consequently, the allocation failed.
>>
>> How to fix:
>> When OCFS2 is encountering fragmented allocation, the file system should
>> stop attempting bits_wanted contiguous allocation and instead provide the
>> largest available contiguous free bits from the cluster groups.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@...e.com>
>> Fixes: 4eb7b93e0310 ("ocfs2: improve write IO performance when fragmentation is high")
>> ---
>>   fs/ocfs2/suballoc.c | 36 +++++++++++++++++++++++++++++-------
>>   1 file changed, 29 insertions(+), 7 deletions(-)
>>
>> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
>> index fde75f2af37a..2e1689fc6cf7 100644
>> --- a/fs/ocfs2/suballoc.c
>> +++ b/fs/ocfs2/suballoc.c
>> @@ -1796,6 +1796,7 @@ static int ocfs2_search_chain(struct ocfs2_alloc_context *ac,
>>   {
>>   	int status;
>>   	u16 chain;
>> +	u32 contig_bits;
>>   	u64 next_group;
>>   	struct inode *alloc_inode = ac->ac_inode;
>>   	struct buffer_head *group_bh = NULL;
>> @@ -1821,10 +1822,23 @@ static int ocfs2_search_chain(struct ocfs2_alloc_context *ac,
>>   	status = -ENOSPC;
>>   	/* for now, the chain search is a bit simplistic. We just use
>>   	 * the 1st group with any empty bits. */
>> -	while ((status = ac->ac_group_search(alloc_inode, group_bh,
>> -					     bits_wanted, min_bits,
>> -					     ac->ac_max_block,
>> -					     res)) == -ENOSPC) {
>> +	while (1) {
>> +		if (ac->ac_which == OCFS2_AC_USE_MAIN_DISCONTIG) {
>> +			contig_bits = le16_to_cpu(bg->bg_contig_free_bits);
>> +			if (!contig_bits)
>> +				contig_bits = ocfs2_find_max_contig_free_bits(bg->bg_bitmap,
>> +						le16_to_cpu(bg->bg_bits), 0);
>> +			if (bits_wanted > contig_bits)
>> +				bits_wanted = contig_bits;
>> +			if (bits_wanted < min_bits)
>> +				bits_wanted = min_bits;
> 
> This seems only valid when bits_wanted changed?

Good catch. the correct style:

if (bits_wanted > contig_bits && contig_bits > min_bits)
         bits_wanted = contig_bits;

- Heming

> 
> BTW, the previous fix hasn't been merged yet:
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/fs/ocfs2?h=next-20250411&id=767ba8b7cba3ca7a560d632f267f7aea35d54810
> 
> So should we drop that first and fold it into a whole one?
> 
> Thanks,
> Joseph

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ