[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210111130054.514634756@linuxfoundation.org>
Date: Mon, 11 Jan 2021 14:02:29 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Filipe Manana <fdmanana@...e.com>,
Qu Wenruo <wqu@...e.com>, David Sterba <dsterba@...e.com>
Subject: [PATCH 5.10 125/145] btrfs: qgroup: dont try to wait flushing if were already holding a transaction
From: Qu Wenruo <wqu@...e.com>
commit ae5e070eaca9dbebde3459dd8f4c2756f8c097d0 upstream.
There is a chance of racing for qgroup flushing which may lead to
deadlock:
Thread A | Thread B
(not holding trans handle) | (holding a trans handle)
--------------------------------+--------------------------------
__btrfs_qgroup_reserve_meta() | __btrfs_qgroup_reserve_meta()
|- try_flush_qgroup() | |- try_flush_qgroup()
|- QGROUP_FLUSHING bit set | |
| | |- test_and_set_bit()
| | |- wait_event()
|- btrfs_join_transaction() |
|- btrfs_commit_transaction()|
!!! DEAD LOCK !!!
Since thread A wants to commit transaction, but thread B is holding a
transaction handle, blocking the commit.
At the same time, thread B is waiting for thread A to finish its commit.
This is just a hot fix, and would lead to more EDQUOT when we're near
the qgroup limit.
The proper fix would be to make all metadata/data reservations happen
without holding a transaction handle.
CC: stable@...r.kernel.org # 5.9+
Reviewed-by: Filipe Manana <fdmanana@...e.com>
Signed-off-by: Qu Wenruo <wqu@...e.com>
Signed-off-by: David Sterba <dsterba@...e.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
fs/btrfs/qgroup.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3565,16 +3565,6 @@ static int try_flush_qgroup(struct btrfs
bool can_commit = true;
/*
- * We don't want to run flush again and again, so if there is a running
- * one, we won't try to start a new flush, but exit directly.
- */
- if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
- wait_event(root->qgroup_flush_wait,
- !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
- return 0;
- }
-
- /*
* If current process holds a transaction, we shouldn't flush, as we
* assume all space reservation happens before a transaction handle is
* held.
@@ -3588,6 +3578,26 @@ static int try_flush_qgroup(struct btrfs
current->journal_info != BTRFS_SEND_TRANS_STUB)
can_commit = false;
+ /*
+ * We don't want to run flush again and again, so if there is a running
+ * one, we won't try to start a new flush, but exit directly.
+ */
+ if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
+ /*
+ * We are already holding a transaction, thus we can block other
+ * threads from flushing. So exit right now. This increases
+ * the chance of EDQUOT for heavy load and near limit cases.
+ * But we can argue that if we're already near limit, EDQUOT is
+ * unavoidable anyway.
+ */
+ if (!can_commit)
+ return 0;
+
+ wait_event(root->qgroup_flush_wait,
+ !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
+ return 0;
+ }
+
ret = btrfs_start_delalloc_snapshot(root);
if (ret < 0)
goto out;
Powered by blists - more mailing lists