lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <1524726876-4770-1-git-send-email-ghe@suse.com>
Date:   Thu, 26 Apr 2018 15:14:36 +0800
From:   Gang He <ghe@...e.com>
To:     teigland@...hat.com
Cc:     Gang He <ghe@...e.com>, cluster-devel@...hat.com,
        linux-kernel@...r.kernel.org
Subject: [PATCH] dlm: make sctp_connect_to_sock() return in specified time

When the user setup a two-ring cluster, DLM kernel module
will automatically selects to use SCTP protocol to communicate
between each node. There will be about 5 minute hang in DLM
kernel module, in case one ring is broken before switching to
another ring, this will potentially affect the dependent upper
applications, e.g. ocfs2, gfs2, clvm and clustered-MD, etc.
Unfortunately, if the user setup a two-ring cluster, we can not
specify DLM communication protocol with TCP explicitly, since
DLM kernel module only supports SCTP protocol for multiple
ring cluster.
Base on my investigation, the time is spent in sock->ops->connect()
function before returns ETIMEDOUT(-110) error, since O_NONBLOCK
argument in connect() function does not work here, then we should
make sock->ops->connect() function return in specified time via
setting socket SO_SNDTIMEO atrribute.

Signed-off-by: Gang He <ghe@...e.com>
---
 fs/dlm/lowcomms.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 5243989..b786acc 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1037,6 +1037,7 @@ static void sctp_connect_to_sock(struct connection *con)
 	int result;
 	int addr_len;
 	struct socket *sock;
+	struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
 
 	if (con->nodeid == 0) {
 		log_print("attempt to connect sock 0 foiled");
@@ -1083,8 +1084,19 @@ static void sctp_connect_to_sock(struct connection *con)
 	kernel_setsockopt(sock, SOL_TCP, TCP_NODELAY, (char *)&one,
 			  sizeof(one));
 
+	/*
+	 * Make sock->ops->connect() function return in specified time,
+	 * since O_NONBLOCK argument in connect() function does not work here,
+	 * then, we should restore the default value of this attribute.
+	 */
+	kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, (char *)&tv,
+			  sizeof(tv));
 	result = sock->ops->connect(sock, (struct sockaddr *)&daddr, addr_len,
 				   O_NONBLOCK);
+	memset(&tv, 0, sizeof(tv));
+	kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, (char *)&tv,
+			  sizeof(tv));
+
 	if (result == -EINPROGRESS)
 		result = 0;
 	if (result == 0)
-- 
1.8.5.6

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ