lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 10 Sep 2020 09:49:01 +0200
From:   Jan Kara <jack@...e.cz>
To:     Ye Bin <yebin10@...wei.com>
Cc:     tytso@....edu, adilger.kernel@...ger.ca, jack@...e.com,
        linux-ext4@...r.kernel.org
Subject: Re: [PATCH v2] ext4: Fix dead loop in ext4_mb_new_blocks


Hello!

On Thu 10-09-20 11:08:06, Ye Bin wrote:
> As we test disk offline/online with running fsstress, we find fsstress
> process is keeping running state.
> kworker/u32:3-262   [004] ...1   140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114
> ....
> kworker/u32:3-262   [004] ...1   140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114
> 
> ext4_mb_new_blocks
> repeat:
> 	ext4_mb_discard_preallocations_should_retry(sb, ac, &seq)
> 		freed = ext4_mb_discard_preallocations
> 			ext4_mb_discard_group_preallocations
> 				this_cpu_inc(discard_pa_seq);
> 		---> freed == 0
> 		seq_retry = ext4_get_discard_pa_seq_sum
> 			for_each_possible_cpu(__cpu)
> 				__seq += per_cpu(discard_pa_seq, __cpu);
> 		if (seq_retry != *seq) {
> 			*seq = seq_retry;
> 			ret = true;
> 		}
> 
> As we see seq_retry is sum of discard_pa_seq every cpu, if
> ext4_mb_discard_group_preallocations return zero discard_pa_seq in this
> cpu maybe increase one, so condition "seq_retry != *seq" have always
> been met.
> To Fix this problem, ext4_get_discard_pa_seq_sum function couldn't add
> own's cpu "discard_pa_seq" value.
> 
> Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling")
> Signed-off-by: Ye Bin <yebin10@...wei.com>

Thanks for the patch! I agree with your analysis but avoiding current CPU
in the sum is wrong. After all there's nothing which prevents the scheduler
from rescheduling your task among different CPUs while it is searching for
preallocations to discard. The correct fix is IMO to change
ext4_mb_discard_group_preallocations() so that it increments discard_pa_seq
only when it found preallocation to discard (and is thus guaranteed to
return value > 0).

								Honza

> ---
>  fs/ext4/mballoc.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index 132c118d12e1..f386fe62727d 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -372,11 +372,13 @@ static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac);
>  static DEFINE_PER_CPU(u64, discard_pa_seq);
>  static inline u64 ext4_get_discard_pa_seq_sum(void)
>  {
> -	int __cpu;
> +	int __cpu, this_cpu;
>  	u64 __seq = 0;
>  
> +	this_cpu = smp_processor_id();
>  	for_each_possible_cpu(__cpu)
> -		__seq += per_cpu(discard_pa_seq, __cpu);
> +		if (this_cpu != __cpu)
> +			__seq += per_cpu(discard_pa_seq, __cpu);
>  	return __seq;
>  }
>  
> -- 
> 2.25.4
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists