[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1460976397-5688-61-git-send-email-lizf@kernel.org>
Date: Mon, 18 Apr 2016 18:46:06 +0800
From: lizf@...nel.org
To: stable@...r.kernel.org
Cc: linux-kernel@...r.kernel.org, Joseph Qi <joseph.qi@...wei.com>,
Joel Becker <jlbec@...lplan.org>,
Mark Fasheh <mfasheh@...e.com>,
"Junxiao Bi" <junxiao.bi@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Zefan Li <lizefan@...wei.com>
Subject: [PATCH 3.4 61/92] ocfs2/dlm: fix deadlock when dispatch assert master
From: Joseph Qi <joseph.qi@...wei.com>
3.4.112-rc1 review patch. If anyone has any objections, please let me know.
------------------
commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a upstream.
The order of the following three spinlocks should be:
dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock
But dlm_dispatch_assert_master() is called while holding
dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls
dlm_grab() which will take dlm_domain_lock.
Once another thread (for example, dlm_query_join_handler) has already
taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock
happens.
Signed-off-by: Joseph Qi <joseph.qi@...wei.com>
Cc: Joel Becker <jlbec@...lplan.org>
Cc: Mark Fasheh <mfasheh@...e.com>
Cc: "Junxiao Bi" <junxiao.bi@...cle.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@...wei.com>
---
fs/ocfs2/dlm/dlmmaster.c | 4 +++-
fs/ocfs2/dlm/dlmrecovery.c | 6 +++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 7ba6ac1..751efa8 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -1411,6 +1411,7 @@ int dlm_master_request_handler(struct o2net_msg *msg, u32 len, void *data,
int found, ret;
int set_maybe;
int dispatch_assert = 0;
+ int dispatched = 0;
if (!dlm_grab(dlm))
return DLM_MASTER_RESP_NO;
@@ -1617,6 +1618,8 @@ send_response:
mlog(ML_ERROR, "failed to dispatch assert master work\n");
response = DLM_MASTER_RESP_ERROR;
dlm_lockres_put(res);
+ } else {
+ dispatched = 1;
}
} else {
if (res)
@@ -2041,7 +2044,6 @@ int dlm_dispatch_assert_master(struct dlm_ctxt *dlm,
/* queue up work for dlm_assert_master_worker */
- dlm_grab(dlm); /* get an extra ref for the work item */
dlm_init_work_item(dlm, item, dlm_assert_master_worker, NULL);
item->u.am.lockres = res; /* already have a ref */
/* can optionally ignore node numbers higher than this node */
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index d15b071..0e5013e 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -1689,6 +1689,7 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
unsigned int hash;
int master = DLM_LOCK_RES_OWNER_UNKNOWN;
u32 flags = DLM_ASSERT_MASTER_REQUERY;
+ int dispatched = 0;
if (!dlm_grab(dlm)) {
/* since the domain has gone away on this
@@ -1710,6 +1711,8 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
mlog_errno(-ENOMEM);
/* retry!? */
BUG();
+ } else {
+ dispatched = 1;
}
} else /* put.. incase we are not the master */
dlm_lockres_put(res);
@@ -1717,7 +1720,8 @@ int dlm_master_requery_handler(struct o2net_msg *msg, u32 len, void *data,
}
spin_unlock(&dlm->spinlock);
- dlm_put(dlm);
+ if (!dispatched)
+ dlm_put(dlm);
return master;
}
--
1.9.1
Powered by blists - more mailing lists