lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250401015156.2623-1-yohan.joung@sk.com>
Date: Tue,  1 Apr 2025 10:51:52 +0900
From: "yohan.joung" <yohan.joung@...com>
To: linux-f2fs-devel@...ts.sourceforge.net
Cc: chao@...nel.org,
	jaegeuk@...nel.org,
	jyh429@...il.com,
	linux-kernel@...r.kernel.org,
	pilhyun.kim@...com,
	yohan.joung@...com
Subject: [External Mail] Re: [f2fs-dev] [External Mail] Re: [External Mail] Re: [PATCH] f2fs: prevent the current section from being selected as a victim during garbage collection

>From: Chao Yu <chao@...nel.org>
>Sent: Monday, March 31, 2025 8:36 PM
>To: 정요한(JOUNG YOHAN) Mobile AE <yohan.joung@...com>; linux-f2fs-
>devel@...ts.sourceforge.net
>Cc: chao@...nel.org; jaegeuk@...nel.org; jyh429@...il.com; linux-
>kernel@...r.kernel.org; 김필현(KIM PILHYUN) Mobile AE <pilhyun.kim@...com>
>Subject: [External Mail] Re: [External Mail] Re: [f2fs-dev] [External Mail]
>Re: [External Mail] Re: [PATCH] f2fs: prevent the current section from
>being selected as a victim during garbage collection
>
>On 3/31/25 13:13, yohan.joung wrote:
>>> On 2025/3/28 15:25, yohan.joung wrote:
>>>>> On 2025/3/28 11:40, yohan.joung wrote:
>>>>>>> From: Chao Yu <chao@...nel.org>
>>>>>>> Sent: Thursday, March 27, 2025 10:48 PM
>>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.joung@...com>; Yohan Joung
>>>>>>> <jyh429@...il.com>; jaegeuk@...nel.org; daeho43@...il.com
>>>>>>> Cc: chao@...nel.org; linux-f2fs-devel@...ts.sourceforge.net;
>>>>>>> linux- kernel@...r.kernel.org; 김필현(KIM PILHYUN) Mobile AE
>>>>>>> <pilhyun.kim@...com>
>>>>>>> Subject: [External Mail] Re: [External Mail] Re: [External Mail] Re:
>>>>>>> [PATCH] f2fs: prevent the current section from being selected as
>>>>>>> a victim during garbage collection
>>>>>>>
>>>>>>> On 2025/3/27 16:00, yohan.joung@...com wrote:
>>>>>>>>> From: Chao Yu <chao@...nel.org>
>>>>>>>>> Sent: Thursday, March 27, 2025 4:30 PM
>>>>>>>>> To: 정요한(JOUNG YOHAN) Mobile AE <yohan.joung@...com>; Yohan
>>>>>>>>> Joung <jyh429@...il.com>; jaegeuk@...nel.org; daeho43@...il.com
>>>>>>>>> Cc: chao@...nel.org; linux-f2fs-devel@...ts.sourceforge.net;
>>>>>>>>> linux- kernel@...r.kernel.org; 김필현(KIM PILHYUN) Mobile AE
>>>>>>>>> <pilhyun.kim@...com>
>>>>>>>>> Subject: [External Mail] Re: [External Mail] Re: [PATCH] f2fs:
>>>>>>>>> prevent the current section from being selected as a victim
>>>>>>>>> during garbage collection
>>>>>>>>>
>>>>>>>>> On 3/27/25 14:43, yohan.joung@...com wrote:
>>>>>>>>>>> From: Chao Yu <chao@...nel.org>
>>>>>>>>>>> Sent: Thursday, March 27, 2025 3:02 PM
>>>>>>>>>>> To: Yohan Joung <jyh429@...il.com>; jaegeuk@...nel.org;
>>>>>>>>>>> daeho43@...il.com
>>>>>>>>>>> Cc: chao@...nel.org; linux-f2fs-devel@...ts.sourceforge.net;
>>>>>>>>>>> linux- kernel@...r.kernel.org; 정요한(JOUNG YOHAN) Mobile AE
>>>>>>>>>>> <yohan.joung@...com>
>>>>>>>>>>> Subject: [External Mail] Re: [PATCH] f2fs: prevent the
>>>>>>>>>>> current section from being selected as a victim during
>>>>>>>>>>> garbage collection
>>>>>>>>>>>
>>>>>>>>>>> On 3/26/25 22:14, Yohan Joung wrote:
>>>>>>>>>>>> When selecting a victim using next_victim_seg in a large
>>>>>>>>>>>> section, the selected section might already have been
>>>>>>>>>>>> cleared and designated as the new current section, making it
>>>>>>>>>>>> actively in
>>>>> use.
>>>>>>>>>>>> This behavior causes inconsistency between the SIT and SSA.
>>>>>>>>>>>
>>>>>>>>>>> Hi, does this fix your issue?
>>>>>>>>>>
>>>>>>>>>> This is an issue that arises when dividing a large section
>>>>>>>>>> into segments for garbage collection.
>>>>>>>>>> caused by the background GC (garbage collection) thread in
>>>>>>>>>> large section
>>>>>>>>>> f2fs_gc(victim_section) ->
>>>>>>>>>> f2fs_clear_prefree_segments(victim_section)->
>>>>>>>>>> cursec(victim_section) -> f2fs_gc(victim_section by
>>>>>>>>>> next_victim_seg)
>>>>>>>>>
>>>>>>>>> I didn't get it, why f2fs_get_victim() will return section
>>>>>>>>> which is used by curseg? It should be avoided by checking w/
>>> sec_usage_check().
>>>>>>>>>
>>>>>>>>> Or we missed to check gcing section which next_victim_seg
>>>>>>>>> points to during get_new_segment()?
>>>>>>>>>
>>>>>>>>> Can this happen?
>>>>>>>>>
>>>>>>>>> e.g.
>>>>>>>>> - bggc selects sec #0
>>>>>>>>> - next_victim_seg: seg #0
>>>>>>>>> - migrate seg #0 and stop
>>>>>>>>> - next_victim_seg: seg #1
>>>>>>>>> - checkpoint, set sec #0 free if sec #0 has no valid blocks
>>>>>>>>> - allocate seg #0 in sec #0 for curseg
>>>>>>>>> - curseg moves to seg #1 after allocation
>>>>>>>>> - bggc tries to migrate seg #1
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>> That's correct
>>>>>>>> In f2fs_get_victim, we use next_victim_seg to directly jump to
>>>>>>>> got_result, thereby bypassing sec_usage_check What do you think
>>>>>>>> about this change?
>>>>>>>>
>>>>>>>> @@ -850,15 +850,20 @@ int f2fs_get_victim(struct f2fs_sb_info
>>>>>>>> *sbi,
>>>>>>> unsigned int *result,
>>>>>>>>                            p.min_segno = sbi->next_victim_seg[BG_GC];
>>>>>>>>                            *result = p.min_segno;
>>>>>>>>                            sbi->next_victim_seg[BG_GC] = NULL_SEGNO;
>>>>>>>> -                       goto got_result;
>>>>>>>>                    }
>>>>>>>>                    if (gc_type == FG_GC &&
>>>>>>>>                                    sbi->next_victim_seg[FG_GC]
>>>>>>>> != NULL_SEGNO)
>>> {
>>>>>>>>                            p.min_segno = sbi->next_victim_seg[FG_GC];
>>>>>>>>                            *result = p.min_segno;
>>>>>>>>                            sbi->next_victim_seg[FG_GC] = NULL_SEGNO;
>>>>>>>> -                       goto got_result;
>>>>>>>>                    }
>>>>>>>> +
>>>>>>>> +               secno = GET_SEC_FROM_SEG(sbi, segno);
>>>>>>>> +
>>>>>>>> +               if (sec_usage_check(sbi, secno))
>>>>>>>> +                       goto next;
>>>>>>>> +
>>>>>>>> +               goto got_result;
>>>>>>>>            }
>>>>>>>
>>>>>>> But still allocator can assign this segment after
>>>>>>> sec_usage_check() in race condition, right?
>>>>>> Since the BG GC using next_victim  takes place after the SIT
>>>>>> update in do_checkpoint, it seems unlikely that a race condition
>>>>>> with
>>>>> sec_usage_check will occur.
>>>>>
>>>>> I mean this:
>>>>>
>>>>> - gc_thread
>>>>>    - f2fs_gc
>>>>>     - f2fs_get_victim
>>>>>      - sec_usage_check --- segno #1 is not used in any cursegs
>>>>> 					- f2fs_allocate_data_block
>>>>> 					 - new_curseg
>>>>> 					  - get_new_segment find segno #1
>>>>>
>>>>>     - do_garbage_collect
>>>>>
>>>>> Thanks,
>>>>
>>>> 						  do_checkpoint sec0 free
>>>> 						  If sec0 is not freed, then
>>> segno1 within sec0 cannot be
>>>> allocated
>>>> - gc_thread
>>>>    - f2fs_gc
>>>>     - f2fs_get_victim
>>>>      - sec_usage_check  --- segno #1 is not used in any cursegs (but
>>>> sec0
>>> is already used)
>>>> 							- f2fs_allocate_data_block
>>>> 							- new_curseg
>>>> 							- get_new_segment find
>>> segno #1
>>>>
>>>>     - do_garbage_collect
>>>>
>>>> I appreciate your patch, it is under testing.
>>>> but I'm wondering if there's a risk of a race condition in this
>>>> situation
>>>
>>> Oh, yes, I may missed that get_new_segment can return a free segment
>>> in partial used section.
>>>
>>> So what do you think of this?
>>> - check CURSEG() in do_garbage_collect() and get_victim()
>>> - reset next_victim_seg[] in get_new_segment() and
>>> __set_test_and_free() during checkpoint.
>>>
>>> Thanks,
>>
>> How about using victim_secmap?
>> gc_thread
>> 				mutex_lock(&DIRTY_I(sbi)->seglist_lock);
>> 				__set_test_and_free
>> 				check cur section next_victim clear
>> 				mutex_unlock(&dirty_i->seglist_lock);
>>
>> mutex_lock(&dirty->seglist_lock);
>> f2fs_get_victim
>> mutex_unlock(&dirty_i->seglist_lock);
>>
>> static inline void __set_test_and_free(struct f2fs_sb_info *sbi,
>>                 if (next >= start_segno + usable_segs) {
>>                         if (test_and_clear_bit(secno, free_i->free_secmap))
>>                                 free_i->free_sections++;
>> +
>> +                       if (test_and_clear_bit(secno, dirty_i->victim_secmap))
>> +                               sbi->next_victim_seg[BG_GC] =
>> + NULL_SEGNO;
>
>Can this happen?
>
>segs_per_sec=2
>
>- seg#0 and seg#1 are all dirty
>- all valid blocks are removed in seg#1
>- checkpoint -> seg#1 becomes free
>- gc select this sec and next_victim_seg=seg#0
>- migrate seg#0, next_victim_seg=seg#1
>- allocator assigns seg#1 to curseg
>- gc tries to migrate seg#1
>
>Thanks,
The detailed scenario
segs_per_sec=2
- seg#0 and seg#1 are all dirty
- all valid blocks are removed in seg#1
- gc select this sec and next_victim_seg=seg#0
- migrate seg#0, next_victim_seg=seg#1
- checkpoint -> sec(seg#0, seg#1)  becomes free
- allocator assigns sec(seg#0, seg#1) to curseg
- gc tries to migrate seg#1
>
>>                 }
>>         }
>>>
>>>>
>>>>
>>>>>
>>>>>>>
>>>>>>> IMO, we can clear next_victim_seg[] once section is free in
>>>>>>> __set_test_and_free()? something like this:
>>>>>> I will test it according to your suggestion.
>>>>>> If there are no issues, can I submit it again with the patch?
>>>>>> Thanks
>>>>>>>
>>>>>>> ---
>>>>>>>     fs/f2fs/segment.h | 13 ++++++++++---
>>>>>>>     1 file changed, 10 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h index
>>>>>>> 0465dc00b349..826e37999085 100644
>>>>>>> --- a/fs/f2fs/segment.h
>>>>>>> +++ b/fs/f2fs/segment.h
>>>>>>> @@ -473,9 +473,16 @@ static inline void
>>>>>>> __set_test_and_free(struct f2fs_sb_info *sbi,
>>>>>>>     			goto skip_free;
>>>>>>>     		next = find_next_bit(free_i->free_segmap,
>>>>>>>     				start_segno + SEGS_PER_SEC(sbi),
>>> start_segno);
>>>>>>> -		if (next >= start_segno + usable_segs) {
>>>>>>> -			if (test_and_clear_bit(secno, free_i-
>>free_secmap))
>>>>>>> -				free_i->free_sections++;
>>>>>>> +		if ((next >= start_segno + usable_segs) &&
>>>>>>> +			test_and_clear_bit(secno, free_i->free_secmap))
>{
>>>>>>> +			free_i->free_sections++;
>>>>>>> +
>>>>>>> +			if (GET_SEC_FROM_SEG(sbi->next_victim_seg[BG_GC])
>==
>>>>>>> +									secno)
>>>>>>> +				sbi->next_victim_seg[BG_GC] = NULL_SEGNO;
>>>>>>> +			if (GET_SEC_FROM_SEG(sbi->next_victim_seg[FG_GC])
>==
>>>>>>> +									secno)
>>>>>>> +				sbi->next_victim_seg[FG_GC] = NULL_SEGNO;
>>>>>>>     		}
>>>>>>>     	}
>>>>>>>     skip_free:
>>>>>>> --
>>>>>>> 2.40.1
>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Because the call stack is different, I think that in order to
>>>>>>>>>> handle everything at once, we need to address it within
>>>>>>>>>> do_garbage_collect, or otherwise include it on both sides.
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> [30146.337471][ T1300] F2FS-fs (dm-54): Inconsistent segment
>>>>>>>>>> (70961) type [0, 1] in SSA and SIT [30146.346151][ T1300] Call
>>> trace:
>>>>>>>>>> [30146.346152][ T1300]  dump_backtrace+0xe8/0x10c
>>>>>>>>>> [30146.346157][ T1300]  show_stack+0x18/0x28 [30146.346158][
>>>>>>>>>> T1300] dump_stack_lvl+0x50/0x6c [30146.346161][ T1300]
>>>>>>>>>> dump_stack+0x18/0x28 [30146.346162][ T1300]
>>>>>>>>>> f2fs_stop_checkpoint+0x1c/0x3c [30146.346165][ T1300]
>>>>>>>>>> do_garbage_collect+0x41c/0x271c [30146.346167][ T1300]
>>>>>>>>>> f2fs_gc+0x27c/0x828 [30146.346168][ T1300]
>>>>>>>>>> gc_thread_func+0x290/0x88c [30146.346169][ T1300]
>>>>>>>>>> kthread+0x11c/0x164 [30146.346172][ T1300]
>>>>>>>>>> ret_from_fork+0x10/0x20
>>>>>>>>>>
>>>>>>>>>> struct curseg_info : 0xffffff803f95e800 {
>>>>>>>>>> 	segno        : 0x11531 : 70961
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> struct f2fs_sb_info : 0xffffff8811d12000 {
>>>>>>>>>> 	next_victim_seg[0] : 0x11531 : 70961 }
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> https://lore.kernel.org/linux-f2fs-devel/20250325080646.32919
>>>>>>>>>>> 47
>>>>>>>>>>> -2
>>>>>>>>>>> -
>>>>>>>>>>> chao@...nel.org
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Yohan Joung <yohan.joung@...com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>     fs/f2fs/gc.c | 4 ++++
>>>>>>>>>>>>     1 file changed, 4 insertions(+)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c index
>>>>>>>>>>>> 2b8f9239bede..4b5d18e395eb 100644
>>>>>>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>>>>>>> @@ -1926,6 +1926,10 @@ int f2fs_gc(struct f2fs_sb_info *sbi,
>>>>>>>>>>>> struct
>>>>>>>>>>> f2fs_gc_control *gc_control)
>>>>>>>>>>>>     		goto stop;
>>>>>>>>>>>>     	}
>>>>>>>>>>>>
>>>>>>>>>>>> +	if (__is_large_section(sbi) &&
>>>>>>>>>>>> +			IS_CURSEC(sbi, GET_SEC_FROM_SEG(sbi, segno)))
>>>>>>>>>>>> +		goto stop;
>>>>>>>>>>>> +
>>>>>>>>>>>>     	seg_freed = do_garbage_collect(sbi, segno, &gc_list,
>gc_type,
>>>>>>>>>>>>     				gc_control->should_migrate_blocks,
>>>>>>>>>>>>     				gc_control->one_time);
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ