[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20121122004049.899615915@linuxfoundation.org>
Date: Wed, 21 Nov 2012 16:41:51 -0800
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org, stable@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
alan@...rguk.ukuu.org.uk, Alex Elder <elder@...tank.com>,
Sage Weil <sage@...tank.com>
Subject: [ 165/171] rbd: reset BACKOFF if unable to re-queue
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Elder <elder@...tank.com>
(cherry picked from commit 588377d6199034c36d335e7df5818b731fea072c)
If ceph_fault() is unable to queue work after a delay, it sets the
BACKOFF connection flag so con_work() will attempt to do so.
In con_work(), when BACKOFF is set, if queue_delayed_work() doesn't
result in newly-queued work, it simply ignores this condition and
proceeds as if no backoff delay were desired. There are two
problems with this--one of which is a bug.
The first problem is simply that the intended behavior is to back
off, and if we aren't able queue the work item to run after a delay
we're not doing that.
The only reason queue_delayed_work() won't queue work is if the
provided work item is already queued. In the messenger, this
means that con_work() is already scheduled to be run again. So
if we simply set the BACKOFF flag again when this occurs, we know
the next con_work() call will again attempt to hold off activity
on the connection until after the delay.
The second problem--the bug--is a leak of a reference count. If
queue_delayed_work() returns 0 in con_work(), con->ops->put() drops
the connection reference held on entry to con_work(). However,
processing is (was) allowed to continue, and at the end of the
function a second con->ops->put() is called.
This patch fixes both problems.
Signed-off-by: Alex Elder <elder@...tank.com>
Reviewed-by: Sage Weil <sage@...tank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
net/ceph/messenger.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -2296,10 +2296,11 @@ restart:
mutex_unlock(&con->mutex);
return;
} else {
- con->ops->put(con);
dout("con_work %p FAILED to back off %lu\n", con,
con->delay);
+ set_bit(CON_FLAG_BACKOFF, &con->flags);
}
+ goto done;
}
if (con->state == CON_STATE_STANDBY) {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists