Message-ID: <iulwol5ygqv7fry543vuoawhn7fjzlz7hmai5stjxqkkvvz6pc@wukeepjempwn>
Date: Mon, 21 Jul 2025 13:07:39 +0200
From: Jan Kara <jack@...e.cz>
To: Baokun Li <libaokun1@...wei.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
jack@...e.cz, linux-kernel@...r.kernel.org, ojaswin@...ux.ibm.com,
julia.lawall@...ia.fr, yi.zhang@...wei.com, yangerkun@...wei.com,
libaokun@...weicloud.com
Subject: Re: [PATCH v3 15/17] ext4: convert free groups order lists to xarrays
On Mon 14-07-25 21:03:25, Baokun Li wrote:
> While traversing the list, a spin_lock is held, which prevents loading
> the buddy and thus makes direct use of ext4_try_lock_group()
> impossible. This can lead to a bouncing scenario: the spin_is_locked()
> check on grp_A indicates the group is free, but the subsequent
> ext4_try_lock_group() fails, forcing the list traversal to repeatedly
> restart from grp_A.
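>
> A minimal sketch of the bouncing pattern (schematic only; the actual
> list and lock names in mballoc are simplified here):
>
>	/* Under the list's lock we may only peek at the group lock: */
>	list_for_each_entry(grp, list, node) {
>		if (spin_is_locked(ext4_group_lock_ptr(sb, grp->bb_group)))
>			continue;	/* grp looks free right now... */
>		/*
>		 * ...but we cannot call ext4_try_lock_group() while
>		 * holding the list lock, so we drop it first -- and by
>		 * the time we retry, another CPU may hold the group
>		 * lock, sending us back to restart from this group.
>		 */
>	}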
>
> In contrast, linear traversal uses ext4_try_lock_group() directly and
> avoids this bouncing. We therefore need a lockless, ordered traversal
> that achieves linear-like efficiency.
>
> To that end, this commit converts both the average fragment size lists
> and the largest free order lists into ordered xarrays.
>
> In an xarray, the index represents the block group number and the value
> holds the block group information; a non-empty value indicates the block
> group's presence.
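>
> A minimal sketch of the mapping, using the generic xarray API (local
> names and the gfp flag are illustrative, not the exact code from this
> patch):
>
>	struct xarray *xa = &sbi->s_mb_largest_free_orders[order];
>
>	/* Insert: the index is the group number, the value the info. */
>	xa_store(xa, grp->bb_group, grp, GFP_ATOMIC);
>
>	/* Lookup: a non-NULL value means the group is in this bucket. */
>	if (xa_load(xa, group))
>		/* group is present in this order bucket */;
>
>	/* Remove the group from the bucket. */
>	xa_erase(xa, group);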
>
> While insertion and deletion complexity remain O(1), lookup complexity
> changes from O(1) to O(n log n), which may slightly reduce
> single-threaded performance.
>
> Additionally, xarray insertion can fail, for example due to memory
> allocation failure. However, since linear traversal remains as a
> fallback, this is not a major problem, so we only emit a warning when
> an insertion fails.
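>
> A minimal sketch of that failure handling (the warning text is
> illustrative, not the exact message from the patch):
>
>	void *old = xa_store(xa, grp->bb_group, grp, GFP_ATOMIC);
>
>	if (xa_is_err(old))
>		/* The ordered hint is lost; linear scan still works. */
>		ext4_warning(sb, "cannot insert group %u in xarray (%d)",
>			     grp->bb_group, xa_err(old));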
>
> A helper, ext4_mb_find_good_group_xarray(), is added to find good
> groups in a given xarray: it scans from position start and, on
> reaching ngroups - 1, wraps around to 0 and continues up to start - 1.
> This guarantees an ordered traversal within the xarray.
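>
> A minimal sketch of the wrap-around walk with the generic xarray
> iterators (simplified; the real helper also checks group suitability):
>
>	static struct ext4_group_info *
>	find_first_group(struct xarray *xa, unsigned long start,
>			 unsigned long ngroups)
>	{
>		struct ext4_group_info *grp;
>		unsigned long idx;
>
>		/* First leg: [start, ngroups - 1]. */
>		xa_for_each_range(xa, idx, grp, start, ngroups - 1)
>			return grp;
>		/* Wrapped leg: [0, start - 1]. */
>		if (start)
>			xa_for_each_range(xa, idx, grp, 0, start - 1)
>				return grp;
>		return NULL;
>	}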
>
> Performance test results are as follows: single-process operations on
> an empty disk show negligible impact, while multi-process workloads
> demonstrate a noticeable performance gain (Pn below denotes n
> concurrent processes).
>
> |CPU: Kunpeng 920 | P80 | P1 |
> |Memory: 512GB |------------------------|-------------------------|
> |960GB SSD (0.5GB/s)| base | patched | base | patched |
> |-------------------|-------|----------------|--------|----------------|
> |mb_optimize_scan=0 | 20097 | 19555 (-2.6%) | 316141 | 315636 (-0.2%) |
> |mb_optimize_scan=1 | 13318 | 15496 (+16.3%) | 325273 | 323569 (-0.5%) |
>
> |CPU: AMD 9654 * 2 | P96 | P1 |
> |Memory: 1536GB |------------------------|-------------------------|
> |960GB SSD (1GB/s) | base | patched | base | patched |
> |-------------------|-------|----------------|--------|----------------|
> |mb_optimize_scan=0 | 53603 | 53192 (-0.7%) | 214243 | 212678 (-0.7%) |
> |mb_optimize_scan=1 | 20887 | 37636 (+80.1%) | 213632 | 214189 (+0.2%) |
>
> Signed-off-by: Baokun Li <libaokun1@...wei.com>
The patch looks good and the results are nice. I've just noticed two typos:
> +static inline void ext4_mb_avg_fragment_size_destory(struct ext4_sb_info *sbi)
^^^ destroy
> +{
> + for (int i = 0; i < MB_NUM_ORDERS(sbi->s_sb); i++)
> + xa_destroy(&sbi->s_mb_avg_fragment_size[i]);
> + kfree(sbi->s_mb_avg_fragment_size);
> +}
> +
> +static inline void ext4_mb_largest_free_orders_destory(struct ext4_sb_info *sbi)
^^^ destroy
> +{
> + for (int i = 0; i < MB_NUM_ORDERS(sbi->s_sb); i++)
> + xa_destroy(&sbi->s_mb_largest_free_orders[i]);
> + kfree(sbi->s_mb_largest_free_orders);
> +}
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR