Message-ID: <20240104152717.rj7mmmij77q3mbiu@quack3>
Date: Thu, 4 Jan 2024 16:27:17 +0100
From: Jan Kara <jack@...e.cz>
To: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
Cc: linux-ext4@...r.kernel.org, Theodore Ts'o <tytso@....edu>,
	Ritesh Harjani <ritesh.list@...il.com>, linux-kernel@...r.kernel.org,
	Jan Kara <jack@...e.cz>, glandvador@...oo.com, bugzilla@...l.emu.id.au
Subject: Re: [PATCH 1/1] ext4: fallback to complex scan if aligned scan doesn't work

On Fri 15-12-23 16:49:50, Ojaswin Mujoo wrote:
> Currently in case the goal length is a multiple of stripe size we use
> ext4_mb_scan_aligned() to find the stripe size aligned physical blocks.
> In case we are not able to find any, we again go back to calling
> ext4_mb_choose_next_group() to search for a different suitable block
> group. However, since the linear search always begins from the start,
> most of the times we end up with the same BG and the cycle continues.
>
> With large filesystems, the CPU can be stuck in this loop for hours
> which can slow down the whole system. Hence, until we figure out a
> better way to continue the search (rather than starting from beginning)
> in ext4_mb_choose_next_group(), lets just fallback to
> ext4_mb_complex_scan_group() in case aligned scan fails, as it is much
> more likely to find the needed blocks.
>
> Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>

If I understand the difference right, the problem is that while
ext4_mb_choose_next_group() guarantees a large enough free space extent
for the CR_GOAL_LEN_FAST or CR_BEST_AVAIL_LEN passes, it does not
guarantee a large enough *aligned* free space extent. Thus for
non-aligned allocations we can fail only due to a race with another
allocating process, but with aligned allocations we can consistently
fail in ext4_mb_scan_aligned() and thus livelock in the allocation loop.

If my understanding is correct, feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
>  fs/ext4/mballoc.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index d72b5e3c92ec..63f12ec02485 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -2895,14 +2895,19 @@ ext4_mb_regular_allocator(struct ext4_allocation_context *ac)
>  			ac->ac_groups_scanned++;
>  			if (cr == CR_POWER2_ALIGNED)
>  				ext4_mb_simple_scan_group(ac, &e4b);
> -			else if ((cr == CR_GOAL_LEN_FAST ||
> -				 cr == CR_BEST_AVAIL_LEN) &&
> -				 sbi->s_stripe &&
> -				 !(ac->ac_g_ex.fe_len %
> -				 EXT4_B2C(sbi, sbi->s_stripe)))
> -				ext4_mb_scan_aligned(ac, &e4b);
> -			else
> -				ext4_mb_complex_scan_group(ac, &e4b);
> +			else {
> +				bool is_stripe_aligned = sbi->s_stripe &&
> +					!(ac->ac_g_ex.fe_len %
> +					  EXT4_B2C(sbi, sbi->s_stripe));
> +
> +				if ((cr == CR_GOAL_LEN_FAST ||
> +				     cr == CR_BEST_AVAIL_LEN) &&
> +				    is_stripe_aligned)
> +					ext4_mb_scan_aligned(ac, &e4b);
> +
> +				if (ac->ac_status == AC_STATUS_CONTINUE)
> +					ext4_mb_complex_scan_group(ac, &e4b);
> +			}
>
>  			ext4_unlock_group(sb, group);
>  			ext4_mb_unload_buddy(&e4b);
> --
> 2.39.3
>
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
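
To make the livelock discussed above easier to picture, here is a rough user-space sketch. It is not kernel code and does not reflect how mballoc actually tracks free space: every name (toy_group, toy_choose_next_group, toy_scan_aligned, toy_scan_complex, toy_allocate) is hypothetical, each group is assumed to hold a single free extent, and the iteration cap only stands in for "spins for hours". It is meant to show just the one property under discussion: a group chooser that restarts from the first group and checks length but not alignment keeps handing back a group where the aligned scan cannot succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy block group (assumption): one free extent [start, start + len). */
struct toy_group {
	unsigned start;
	unsigned len;
};

/*
 * Stand-in for a linear search that always restarts at group 0 and only
 * checks that the free extent is long enough, not that it is aligned.
 */
static int toy_choose_next_group(const struct toy_group *g, size_t n,
				 unsigned goal_len)
{
	for (size_t i = 0; i < n; i++)
		if (g[i].len >= goal_len)
			return (int)i;
	return -1;
}

/* Succeeds only if a stripe-aligned run of goal_len blocks fits. */
static bool toy_scan_aligned(const struct toy_group *g, unsigned goal_len,
			     unsigned stripe)
{
	assert(stripe > 0);
	/* First stripe-aligned block at or after the start of the extent. */
	unsigned first = (g->start + stripe - 1) / stripe * stripe;

	return first + goal_len <= g->start + g->len;
}

/* Unaligned scan: any long-enough free extent will do. */
static bool toy_scan_complex(const struct toy_group *g, unsigned goal_len)
{
	return g->len >= goal_len;
}

/*
 * Returns the iteration on which the allocation succeeded, or -1 if no
 * group qualifies or the iteration budget runs out.
 */
static int toy_allocate(const struct toy_group *g, size_t n,
			unsigned goal_len, unsigned stripe,
			bool fallback, int max_iters)
{
	for (int iter = 1; iter <= max_iters; iter++) {
		int idx = toy_choose_next_group(g, n, goal_len);

		if (idx < 0)
			return -1;
		if (toy_scan_aligned(&g[idx], goal_len, stripe))
			return iter;
		/* Idea of the patch: try the unaligned scan in the same group. */
		if (fallback && toy_scan_complex(&g[idx], goal_len))
			return iter;
		/* Otherwise retry; the chooser will pick the same group again. */
	}
	return -1;
}
```

With groups { {1, 10}, {0, 16} }, a goal length of 8 and a stripe of 8, group 0 is long enough but has no aligned run, so without the fallback toy_allocate() exhausts its budget even though group 1 would have worked; with the fallback it succeeds on the first iteration by taking an unaligned extent from group 0.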