Message-ID: <7a8b552a-d1e0-89e2-5f49-7b4fd2011c70@faulpeltz.net>
Date: Thu, 13 Oct 2016 23:26:59 +0200
From: Michael Gissing <mg@...lpeltz.net>
To: alexng@...rosoft.com
Cc: kys@...rosoft.com, linux-kernel@...r.kernel.org,
devel@...uxdriverproject.org, olaf@...fle.de, apw@...onical.com,
vkuznets@...hat.com, gregkh@...uxfoundation.org
Subject: [PATCH] Tools: hv: recover after hv_vss_daemon freeze times out

If a FIFREEZE operation run by hv_vss_daemon takes longer than the
VSS_USERSPACE_TIMEOUT set in the hv_snapshot module, the daemon's write
to the hv_vss device fails and the daemon exits. After that, every VSS
operation sent by the Hyper-V host fails until the daemon is restarted.

Instead of exiting after the write failure, try to recover by thawing,
reopening the hv_vss device and performing the initial handshake again.

Signed-off-by: Michael Gissing <mg@...lpeltz.net>
---
tools/hv/hv_vss_daemon.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
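
For reviewers who want the resulting control flow at a glance, here is a
minimal standalone sketch. It is not part of the patch and does not touch
the real device: the enum, the function name count_opens and the scripted
outcomes are hypothetical stand-ins, and the open, handshake, thaw and
close steps are reduced to comments. It only models "one open + handshake
per connection, reconnect after a failed reply write".

```c
#include <stddef.h>

/* Hypothetical outcomes of handling one host request (illustration only). */
enum outcome {
	REPLY_OK,		/* reply written to the device successfully */
	REPLY_WRITE_FAILED,	/* write() failed, e.g. after a freeze timeout */
	HOST_DONE		/* stop the simulation */
};

/*
 * Replay a scripted list of outcomes and return how many times the
 * device would have been opened (each open implies a new handshake).
 */
static int count_opens(const enum outcome *script, size_t n)
{
	size_t i = 0;
	int opens = 0;

	while (i < n) {
		/* Corresponds to the "recover:" label: open + handshake. */
		opens++;

		while (i < n) {
			enum outcome o = script[i++];

			if (o == HOST_DONE)
				return opens;
			if (o == REPLY_WRITE_FAILED) {
				/* The patch thaws and closes the fd here. */
				break;	/* leave the message loop, reconnect */
			}
			/* REPLY_OK: keep serving on the same connection. */
		}
	}
	return opens;
}
```

Replaying "ok, write failed, ok, done" gives two opens: the initial one and
one recovery. Without the patch the daemon would have exited at the failed
write instead.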
diff --git a/tools/hv/hv_vss_daemon.c b/tools/hv/hv_vss_daemon.c
index 5d51d6f..0ecbdab 100644
--- a/tools/hv/hv_vss_daemon.c
+++ b/tools/hv/hv_vss_daemon.c
@@ -176,6 +176,7 @@ int main(int argc, char *argv[])
 	openlog("Hyper-V VSS", 0, LOG_USER);
 	syslog(LOG_INFO, "VSS starting; pid is:%d", getpid());
+recover:
 	vss_fd = open("/dev/vmbus/hv_vss", O_RDWR);
 	if (vss_fd < 0) {
 		syslog(LOG_ERR, "open /dev/vmbus/hv_vss failed; error: %d %s",
@@ -196,6 +197,7 @@ int main(int argc, char *argv[])
 	}
 	pfd.fd = vss_fd;
+	in_handshake = 1;
 	while (1) {
 		pfd.events = POLLIN;
@@ -258,7 +260,14 @@ int main(int argc, char *argv[])
 		if (len != sizeof(struct hv_vss_msg)) {
 			syslog(LOG_ERR, "write failed; error: %d %s", errno,
 			       strerror(errno));
-			exit(EXIT_FAILURE);
+			/*
+			 * try to recover from possible timeout by THAWing
+			 * and restarting the message loop
+			 */
+			vss_operate(VSS_OP_THAW);
+			close(vss_fd);
+			syslog(LOG_INFO, "trying to recover VSS connection");
+			goto recover;
 		}
 	}
--
2.7.4