Message-ID: <ff0e3bdf-d5e4-8709-6d6b-c8d36b57b828@kernel.org>
Date: Fri, 23 Jul 2021 08:39:03 +0800
From: Chao Yu <chao@...nel.org>
To: Jaegeuk Kim <jaegeuk@...nel.org>, linux-kernel@...r.kernel.org,
linux-f2fs-devel@...ts.sourceforge.net
Subject: Re: [f2fs-dev] [PATCH v2] f2fs: don't sleep while grabing
nat_tree_lock
On 2021/7/23 1:44, Jaegeuk Kim wrote:
> This tries to fix priority inversion in the below condition resulting in
> long checkpoint delay.
>
> f2fs_get_node_info()
> - nat_tree_lock
> -> sleep to grab journal_rwsem by contention
>
> checkpoint
> - waiting for nat_tree_lock
>
> In order to let checkpoint go, let's release nat_tree_lock, if there's a
> journal_rwsem contention.
The write lock of nat_tree_lock is taken in many places. How about retrying
the unlock/lock only while checkpoint() is flushing NAT blocks?
---
fs/f2fs/f2fs.h | 1 +
fs/f2fs/node.c | 8 ++++++--
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index ad8f99d7235f..05f41a15fda4 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -877,6 +877,7 @@ struct f2fs_nm_info {
 	spinlock_t nat_list_lock; /* protect clean nat entry list */
 	unsigned int nat_cnt[MAX_NAT_STATE]; /* the # of cached nat entries */
 	unsigned int nat_blocks; /* # of nat blocks */
+	bool flushing_nat; /* indicate checkpoint() is flushing nat blocks */
 	/* free node ids management */
 	struct radix_tree_root free_nid_root;/* root of the free_nid cache */
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index c60ba4179bb2..2caa171a68f8 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -569,7 +569,7 @@ int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid,
 	 * This sem is on the critical path on the checkpoint requiring the above
 	 * nat_tree_lock. Therefore, we should retry, if we failed to grab here.
 	 */
-	if (!down_read_trylock(&curseg->journal_rwsem)) {
+	if (!down_read_trylock(&curseg->journal_rwsem) && nm_i->flushing_nat) {
 		up_read(&nm_i->nat_tree_lock);
 		goto retry;
 	}
@@ -2981,6 +2981,8 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	LIST_HEAD(sets);
 	int err = 0;
+	nm_i->flushing_nat = true;
+
 	/*
 	 * during unmount, let's flush nat_bits before checking
 	 * nat_cnt[DIRTY_NAT].
@@ -2992,7 +2994,7 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	}
 	if (!nm_i->nat_cnt[DIRTY_NAT])
-		return 0;
+		goto out;
 	down_write(&nm_i->nat_tree_lock);
@@ -3026,6 +3028,8 @@ int f2fs_flush_nat_entries(struct f2fs_sb_info *sbi, struct cp_control *cpc)
 	up_write(&nm_i->nat_tree_lock);
 	/* Allow dirty nats by node block allocation in write_begin */
+out:
+	nm_i->flushing_nat = false;
 	return err;
 }
--
2.22.1