Message-Id: <1164360864.3392.141.camel@quoit.chygwyn.com>
Date: Fri, 24 Nov 2006 09:34:24 +0000
From: Steven Whitehouse <swhiteho@...hat.com>
To: linux-kernel@...r.kernel.org, cluster-devel@...hat.com
Cc: David Teigland <teigland@...hat.com>
Subject: [DLM] fix stopping unstarted recovery [5/9]
From 6db92a6b47fd3b9089de9f14e5143c459f86c78a Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@...hat.com>
Date: Tue, 31 Oct 2006 11:56:08 -0600
Subject: [PATCH] [DLM] fix stopping unstarted recovery
Red Hat BZ 211914
When many nodes join a lockspace simultaneously, the dlm gets a
quick sequence of stop/start events, one pair for each node added.
dlm_controld in user space sends each stop and start event to
dlm_recoverd in the kernel. dlm_controld will sometimes send the stop
before dlm_recoverd has had a chance to take up the previously queued
start. The stop aborts the processing of the previous start by setting
the RECOVERY_STOP flag. dlm_recoverd erroneously clears this flag, and
so ignores the stop/abort, if it happens to take up the start after the
stop meant to abort it. The fix is to check the sequence number, which
is incremented for each stop/start, before clearing the flag.
Signed-off-by: David Teigland <teigland@...hat.com>
Signed-off-by: Steven Whitehouse <swhiteho@...hat.com>
---
fs/dlm/recoverd.c | 7 ++++++-
1 files changed, 6 insertions(+), 1 deletions(-)
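
The race described above can be illustrated with a small single-threaded
userspace model. This is only a sketch under stated assumptions: the
names model_ls, model_rv, model_stop, model_start and model_take are
hypothetical stand-ins, not kernel symbols, and the model assumes that
both stop and start bump the sequence number and that start stamps the
queued arguments with the new value. Locking is omitted because the
model runs in one thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the queued recovery arguments. */
struct model_rv {
	uint64_t seq;	/* sequence number recorded when the start was queued */
};

/* Hypothetical stand-in for the lockspace state relevant to the race. */
struct model_ls {
	uint64_t recover_seq;		/* bumped on every stop and every start */
	bool recovery_stop;		/* plays the role of the RECOVERY_STOP flag */
	struct model_rv *recover_args;	/* queued start, if any */
};

/* Stop: request an abort and advance the sequence number. */
static void model_stop(struct model_ls *ls)
{
	ls->recovery_stop = true;
	ls->recover_seq++;
}

/* Start: queue recovery arguments stamped with a fresh sequence number. */
static void model_start(struct model_ls *ls, struct model_rv *rv)
{
	rv->seq = ++ls->recover_seq;
	ls->recover_args = rv;
}

/*
 * Take up the queued start. The stop flag is cleared only when no stop
 * (or newer start) has advanced the sequence number since this start
 * was queued; otherwise the pending abort is left in place.
 */
static struct model_rv *model_take(struct model_ls *ls)
{
	struct model_rv *rv = ls->recover_args;

	ls->recover_args = NULL;
	if (rv && ls->recover_seq == rv->seq)
		ls->recovery_stop = false;
	return rv;
}
```

In this model, the sequence stop, start, take clears the flag, while
stop, start, stop, take leaves it set, which is the behaviour the patch
is meant to restore.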
diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 4a1d602..6e4ee94 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c
@@ -219,6 +219,10 @@ static int ls_recover(struct dlm_ls *ls,
 	return error;
 }
 
+/* The dlm_ls_start() that created the rv we take here may already have been
+   stopped via dlm_ls_stop(); in that case we need to leave the RECOVERY_STOP
+   flag set. */
+
 static void do_ls_recovery(struct dlm_ls *ls)
 {
 	struct dlm_recover *rv = NULL;
@@ -226,7 +230,8 @@ static void do_ls_recovery(struct dlm_ls
 	spin_lock(&ls->ls_recover_lock);
 	rv = ls->ls_recover_args;
 	ls->ls_recover_args = NULL;
-	clear_bit(LSFL_RECOVERY_STOP, &ls->ls_flags);
+	if (rv && ls->ls_recover_seq == rv->seq)
+		clear_bit(LSFL_RECOVERY_STOP, &ls->ls_flags);
 	spin_unlock(&ls->ls_recover_lock);
 
 	if (rv) {
--
1.4.1