[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190731034159.GH8289@codeaurora.org>
Date: Wed, 31 Jul 2019 09:11:59 +0530
From: Sahitya Tummala <stummala@...eaurora.org>
To: Chao Yu <yuchao0@...wei.com>
Cc: Chao Yu <chao@...nel.org>, Jaegeuk Kim <jaegeuk@...nel.org>,
linux-f2fs-devel@...ts.sourceforge.net,
linux-kernel@...r.kernel.org, stummala@...eaurora.org
Subject: Re: [f2fs-dev] [PATCH] f2fs: Fix indefinite loop in f2fs_gc()
Hi Chao,
On Tue, Jul 30, 2019 at 08:35:46PM +0800, Chao Yu wrote:
> Hi Sahitya,
>
> On 2019/7/30 12:36, Sahitya Tummala wrote:
> > Hi Chao,
> >
> > On Tue, Jul 30, 2019 at 12:00:45AM +0800, Chao Yu wrote:
> >> Hi Sahitya,
> >>
> >> On 2019-7-29 13:20, Sahitya Tummala wrote:
> >>> Policy - foreground GC, LFS mode and greedy GC mode.
> >>>
> >>> Under this policy, f2fs_gc() loops forever to GC as it doesn't have
> >>> enough free segements to proceed and thus it keeps calling gc_more
> >>> for the same victim segment. This can happen if the selected victim
> >>> segment could not be GC'd due to failed blkaddr validity check i.e.
> >>> is_alive() returns false for the blocks set in current validity map.
> >>>
> >>> Fix this by not resetting the sbi->cur_victim_sec to NULL_SEGNO, when
> >>> the segment selected could not be GC'd. This helps to select another
> >>> segment for GC and thus helps to proceed forward with GC.
> >>>
> >>> Signed-off-by: Sahitya Tummala <stummala@...eaurora.org>
> >>> ---
> >>> fs/f2fs/gc.c | 2 +-
> >>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
> >>> index 8974672..7bbcc4a 100644
> >>> --- a/fs/f2fs/gc.c
> >>> +++ b/fs/f2fs/gc.c
> >>> @@ -1303,7 +1303,7 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync,
> >>> round++;
> >>> }
> >>>
> >>> - if (gc_type == FG_GC)
> >>> + if (gc_type == FG_GC && seg_freed)
> >>> sbi->cur_victim_sec = NULL_SEGNO;
> >>
> >> In some cases, we may remain last victim in sbi->cur_victim_sec, and jump out of
> >> GC cycle, then SSR can skip the last victim due to sec_usage_check()...
> >>
> >
> > I see. I have a few questions on how to fix this issue. Please share your
> > comments.
> >
> > 1. Do you think the scenario described is valid? It happens rarely, not very
>
> IIRC, we suffered endless gc loop due to there is valid block belong to an
> opened atomic write file. (because we will skip directly once we hit atomic file)
>
> For your case, I'm not sure that would happen, did you look into is_alive(), why
> will it fail? block address not match? If so, it looks like summary info and
> dnode block and nat entry are inconsistent.
Yes, from the ramdumps, I could see that block address is not matching and
hence, is_alive() could fail in the issue scenario. Have you observed any such
cases before? What could be the reason for this mismatch?
Thanks,
>
> > easy to reproduce. From the dumps, I see that only block is set as valid in
> > the sentry->cur_valid_map for which I see that summary block check is_alive()
> > could return false. As only one block is set as valid, chances are there it
> > can be always selected as the victim by get_victim_by_default() under FG_GC.
> >
> > 2. What are the possible scenarios where summary block check is_alive() could
> > fail for a segment?
>
> I guess, maybe after check_valid_map(), the block is been truncated before
> is_alive(). If so the victim should be prefree directly instead of being
> selected again...
>
> >
> > 3. How does GC handle such segments?
>
> I think that's not a normal case, or I'm missing something.
>
> Thanks,
>
> >
> > Thanks,
> >
> >> Thanks,
> >>
> >>>
> >>> if (sync)
> >>>
> >
--
--
Sent by a consultant of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
Powered by blists - more mailing lists