Message-Id: <1485648328-2141-49-git-send-email-jsimmons@infradead.org>
Date:   Sat, 28 Jan 2017 19:05:16 -0500
From:   James Simmons <jsimmons@...radead.org>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        devel@...verdev.osuosl.org,
        Andreas Dilger <andreas.dilger@...el.com>,
        Oleg Drokin <oleg.drokin@...el.com>
Cc:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Lustre Development List <lustre-devel@...ts.lustre.org>,
        Liang Zhen <liang.zhen@...el.com>,
        James Simmons <jsimmons@...radead.org>
Subject: [PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection

From: Liang Zhen <liang.zhen@...el.com>

The ksocklnd reaper thread always tries to close the connection for the
first timed-out zero-copy TX. This is wrong if that connection is
already being closed: the reaper keeps seeing the same TX again and
again, so it never finds the other timed-out zero-copy TXs or closes
the connections for them.

Signed-off-by: Liang Zhen <liang.zhen@...el.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8867
Reviewed-on: https://review.whamcloud.com/23973
Reviewed-by: Doug Oucharek <doug.s.oucharek@...el.com>
Reviewed-by: Oleg Drokin <oleg.drokin@...el.com>
Signed-off-by: James Simmons <jsimmons@...radead.org>
---
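Note for reviewers: the selection logic this patch introduces can be
sketched in userspace roughly as below. This is only an illustration
under stated assumptions, not the kernel code. The demo_* types and
demo_find_stale() are hypothetical stand-ins for struct ksock_conn,
struct ksock_tx and the reaper loop; the array replaces the
ksnp_zc_req_list walk; the early break assumes the list is kept in
deadline order; and a plain integer comparison stands in for
cfs_time_aftereq(), so jiffies wraparound is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct ksock_conn. */
struct demo_conn {
	bool closing;		/* models ksnc_closing */
};

/* Hypothetical, simplified stand-in for struct ksock_tx. */
struct demo_tx {
	unsigned long deadline;	/* models tx_deadline */
	struct demo_conn *conn;	/* models tx_conn */
};

/*
 * Return the first timed-out TX whose connection is not already being
 * closed, or NULL if there is none. *n_stale receives the number of
 * such TXs. Assumes txs[] is ordered by deadline, oldest first.
 */
static struct demo_tx *demo_find_stale(struct demo_tx *txs, size_t count,
				       unsigned long now, int *n_stale)
{
	struct demo_tx *tx_stale = NULL;
	size_t i;

	*n_stale = 0;
	for (i = 0; i < count; i++) {
		struct demo_tx *tx = &txs[i];

		/* Nothing from here on has timed out yet. */
		if (now < tx->deadline)
			break;

		/* Ignore the TX if its connection is being closed. */
		if (tx->conn->closing)
			continue;

		/* Remember the first usable TX, keep counting the rest. */
		if (!tx_stale)
			tx_stale = tx;
		(*n_stale)++;
	}

	return tx_stale;
}
```

With the pre-patch logic the caller would always take the head of the
list, even when the head's connection was the one being skipped as
closing. Returning the remembered entry instead lets the caller act on
a connection that can still be closed.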
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index df4f55e..b7043e2 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -2456,6 +2456,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 
 	list_for_each_entry(peer, peers, ksnp_list) {
 		unsigned long deadline = 0;
+		struct ksock_tx *tx_stale;
 		int resid = 0;
 		int n = 0;
 
@@ -2503,6 +2504,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 		if (list_empty(&peer->ksnp_zc_req_list))
 			continue;
 
+		tx_stale = NULL;
 		spin_lock(&peer->ksnp_lock);
 		list_for_each_entry(tx, &peer->ksnp_zc_req_list, tx_zc_list) {
 			if (!cfs_time_aftereq(cfs_time_current(),
@@ -2511,26 +2513,26 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 			/* ignore the TX if connection is being closed */
 			if (tx->tx_conn->ksnc_closing)
 				continue;
+			if (!tx_stale)
+				tx_stale = tx;
 			n++;
 		}
 
-		if (!n) {
+		if (!tx_stale) {
 			spin_unlock(&peer->ksnp_lock);
 			continue;
 		}
 
-		tx = list_entry(peer->ksnp_zc_req_list.next,
-				struct ksock_tx, tx_zc_list);
-		deadline = tx->tx_deadline;
-		resid = tx->tx_resid;
-		conn = tx->tx_conn;
+		deadline = tx_stale->tx_deadline;
+		resid = tx_stale->tx_resid;
+		conn = tx_stale->tx_conn;
 		ksocknal_conn_addref(conn);
 
 		spin_unlock(&peer->ksnp_lock);
 		read_unlock(&ksocknal_data.ksnd_global_lock);
 
 		CERROR("Total %d stale ZC_REQs for peer %s detected; the oldest(%p) timed out %ld secs ago, resid: %d, wmem: %d\n",
-		       n, libcfs_nid2str(peer->ksnp_id.nid), tx,
+		       n, libcfs_nid2str(peer->ksnp_id.nid), tx_stale,
 		       cfs_duration_sec(cfs_time_current() - deadline),
 		       resid, conn->ksnc_sock->sk->sk_wmem_queued);
 
-- 
1.8.3.1