Message-ID: <20230616085608.42435-1-libaokun1@huawei.com>
Date: Fri, 16 Jun 2023 16:56:08 +0800
From: Baokun Li <libaokun1@...wei.com>
To: <jack@...e.cz>
CC: <linux-fsdevel@...r.kernel.org>, <linux-ext4@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>,
<yangerkun@...wei.com>, <chengzhihao1@...wei.com>,
<yukuai3@...wei.com>, <libaokun1@...wei.com>
Subject: [PATCH] quota: fix race condition between dqput() and dquot_mark_dquot_dirty()
We ran into a problem where dqput() and dquot_mark_dquot_dirty() may race as shown in the
function graph below, causing a released dquot to be added to dqi_dirty_list. That dquot is
then released again in dquot_writeback_dquots(), leaving the same dquot on free_dquots twice.
 cpu1                                    cpu2
_________________________|__________________________________
wb_do_writeback                          CHOWN(1)
 ...
  ext4_da_update_reserve_space
   dquot_claim_block
    ...
     dquot_mark_dquot_dirty // try to dirty old quota
      test_bit(DQ_ACTIVE_B, &dquot->dq_flags) // still ACTIVE
      if (test_bit(DQ_MOD_B, &dquot->dq_flags))
      // test no dirty, wait dq_list_lock
                                         ...
                                         dquot_transfer
                                          __dquot_transfer
                                           dqput_all(transfer_from) // rls old dquot
                                            dqput // last dqput
                                             dquot_release
                                              clear_bit(DQ_ACTIVE_B, &dquot->dq_flags)
                                            atomic_dec(&dquot->dq_count)
                                            put_dquot_last(dquot)
                                             list_add_tail(&dquot->dq_free, &free_dquots)
                                             // first add the dquot to free_dquots
      if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags))
       add dqi_dirty_list // add freed dquot to dirty_list

 P3:
 ksys_sync
  ...
   dquot_writeback_dquots
    WARN_ON(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
    dqgrab(dquot)
     WARN_ON_ONCE(!atomic_read(&dquot->dq_count))
     WARN_ON_ONCE(!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
    dqput(dquot)
     put_dquot_last(dquot)
      list_add_tail(&dquot->dq_free, &free_dquots)
      // Double add the dquot to free_dquots
This results in list_del corruption when the entry is removed from free_dquots, and the
dquot may even be freed twice in dqcache_shrink_scan(), triggering a use-after-free.
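For reference, the window exists because the unpatched dquot_mark_dquot_dirty() only checks
DQ_ACTIVE_B before taking dq_list_lock. A simplified sketch of the pre-patch code (fast paths
elided), for illustration only:

/* Simplified illustration of the pre-patch code, not the exact source */
int dquot_mark_dquot_dirty(struct dquot *dquot)
{
        int ret = 1;

        if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags))
                return 0;
        /*
         * Window: dquot_release() can clear DQ_ACTIVE_B and dqput() can
         * move the dquot to free_dquots before dq_list_lock is taken below.
         */
        spin_lock(&dq_list_lock);
        if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags)) {
                list_add(&dquot->dq_dirty, &sb_dqopt(dquot->dq_sb)->
                                info[dquot->dq_id.type].dqi_dirty_list);
                ret = 0;
        }
        spin_unlock(&dq_list_lock);
        return ret;
}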
A warning can also be triggered by a race like the one in the function graph below:
 cpu1                       cpu2                       cpu3
______________|_________________________|_____________________
wb_do_writeback             CHOWN(1)                   QUOTASYNC(1)
...                          ...
ext4_da_update_reserve_space
...                          __dquot_transfer
                              dqput // last dqput
                               dquot_release
                                dquot_is_busy
                                 if (test_bit(DQ_MOD_B, &dquot->dq_flags))
                                 // not dirty and still active
dquot_mark_dquot_dirty
 if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags))
  add dqi_dirty_list
                                clear_bit(DQ_ACTIVE_B, &dquot->dq_flags)
                                                       dquot_writeback_dquots
                                                        WARN_ON(!test_bit(DQ_ACTIVE_B))
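On the release side the dirty bit is only sampled once, in dquot_is_busy(); a rough sketch of
that ordering (locking details and error handling elided, for illustration only):

/* Simplified illustration of the release side, not the exact source */
static inline int dquot_is_busy(struct dquot *dquot)
{
        if (test_bit(DQ_MOD_B, &dquot->dq_flags))
                return 1;
        if (atomic_read(&dquot->dq_count) > 1)
                return 1;
        return 0;
}

/* in dquot_release(), with dq_lock held */
if (dquot_is_busy(dquot))               /* DQ_MOD_B is checked only here */
        goto out_dqlock;
/* ... another CPU can set DQ_MOD_B and queue the dquot right here ... */
clear_bit(DQ_ACTIVE_B, &dquot->dq_flags);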
To solve this problem, follow the way dqget() avoids racing with dquot_release(): first set
the DQ_MOD_B flag, then call wait_on_dquot(). After that we know that either dquot_release()
has already finished or it will bail out because of the DQ_MOD_B flag test. At this point,
if the dquot is still DQ_ACTIVE_B, we can safely add it to dqi_dirty_list; otherwise, clear
the DQ_MOD_B flag and return.
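For comparison, dqget() already handles its race with dquot_release() by the same ordering;
roughly sketched (simplified, for illustration only):

/* Simplified illustration of the existing dqget() pattern */
int err;

wait_on_dquot(dquot);   /* waits on dq_lock, so a dquot_release() that
                         * already started has finished by now */
if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
        /* dquot was released in the meantime - (re)acquire it */
        err = sb->dq_op->acquire_dquot(dquot);
        if (err < 0) {
                dqput(dquot);
                dquot = ERR_PTR(err);
        }
}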
Fixes: 4580b30ea887 ("quota: Do not dirty bad dquots")
Signed-off-by: Baokun Li <libaokun1@...wei.com>
---
Hello Honza,
This problem could also be solved by changing the reference counting scheme so that a dquot
holds a reference from allocation until it is destroyed, i.e. dquots on the free_dquots list
would have dq_count == 1. That would let us drop the reference as soon as we enter dqput()
and add the dquot to dqi_dirty_list only when dq_count > 1, which also guarantees that a
dquot on dqi_dirty_list always has DQ_ACTIVE_B set. However, that is a more intrusive change,
so this patch instead follows the way dqget() avoids racing with dquot_release(). If you
prefer the dq_count-based solution, I would be happy to send another version of the patch.
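To make that alternative concrete, a rough, untested sketch of the dq_count-based idea
(hypothetical, not what this patch implements) could look like:

/* Hypothetical sketch: entries on free_dquots would keep dq_count == 1 */

/* in dqput(): drop our reference up front, under dq_list_lock */
atomic_dec(&dquot->dq_count);

/* in dquot_mark_dquot_dirty(): only dirty a dquot that still has users */
spin_lock(&dq_list_lock);
if (atomic_read(&dquot->dq_count) > 1 &&
    !test_and_set_bit(DQ_MOD_B, &dquot->dq_flags))
        list_add(&dquot->dq_dirty, &sb_dqopt(dquot->dq_sb)->
                        info[dquot->dq_id.type].dqi_dirty_list);
spin_unlock(&dq_list_lock);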
Thanks,
Baokun.
fs/quota/dquot.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index e3e4f4047657..2a04cd74c7c5 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -362,11 +362,26 @@ int dquot_mark_dquot_dirty(struct dquot *dquot)
                 return 1;

         spin_lock(&dq_list_lock);
-        if (!test_and_set_bit(DQ_MOD_B, &dquot->dq_flags)) {
+        ret = test_and_set_bit(DQ_MOD_B, &dquot->dq_flags);
+        if (ret)
+                goto out_lock;
+        spin_unlock(&dq_list_lock);
+
+        /*
+         * Wait for dq_lock - after this we know that either dquot_release() is
+         * already finished or it will be canceled due to DQ_MOD_B flag test.
+         */
+        wait_on_dquot(dquot);
+        spin_lock(&dq_list_lock);
+        if (!test_bit(DQ_ACTIVE_B, &dquot->dq_flags)) {
+                clear_bit(DQ_MOD_B, &dquot->dq_flags);
+                goto out_lock;
+        }
+        /* DQ_MOD_B cleared here means the dquot has already been written back */
+        if (test_bit(DQ_MOD_B, &dquot->dq_flags))
                 list_add(&dquot->dq_dirty, &sb_dqopt(dquot->dq_sb)->
                                 info[dquot->dq_id.type].dqi_dirty_list);
-                ret = 0;
-        }
+out_lock:
         spin_unlock(&dq_list_lock);
         return ret;
 }
@@ -791,7 +806,7 @@ void dqput(struct dquot *dquot)
                 return;
         }
         /* Need to release dquot? */
-        if (dquot_dirty(dquot)) {
+        if (test_bit(DQ_ACTIVE_B, &dquot->dq_flags) && dquot_dirty(dquot)) {
                 spin_unlock(&dq_list_lock);
                 /* Commit dquot before releasing */
                 ret = dquot->dq_sb->dq_op->write_dquot(dquot);
--
2.31.1