Message-ID: <20140208133744.GA20512@glanzmann.de>
Date:	Sat, 8 Feb 2014 14:37:45 +0100
From:	Thomas Glanzmann <thomas@...nzmann.de>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	John Ogness <john.ogness@...utronix.de>,
	Eric Dumazet <edumazet@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>,
	target-devel <target-devel@...r.kernel.org>,
	Linux Network Development <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: REGRESSION f54b311142a92ea2e42598e347b84e1655caf8e3 tcp auto
 corking slows down iSCSI file system creation by factor of 70 [WAS: 4 TB
 VMFS creation takes 15 minutes vs 26 seconds]

Hello Eric,

> > tcp corking kills iSCSI performance

> Here is the combined patch, could you test it?

The patch did not apply, so I applied it by hand. Here is the resulting
patch:

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 03d26b8..40d1958 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -698,7 +698,8 @@ static void tcp_tsq_handler(struct sock *sk)
 	if ((1 << sk->sk_state) &
 	    (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_CLOSING |
 	     TCPF_CLOSE_WAIT  | TCPF_LAST_ACK))
-		tcp_write_xmit(sk, tcp_current_mss(sk), 0, 0, GFP_ATOMIC);
+		tcp_write_xmit(sk, tcp_current_mss(sk), tcp_sk(sk)->nonagle,
+			       0, GFP_ATOMIC);
 }
 /*
  * One tasklet per cpu tries to send more skbs.
@@ -1904,7 +1905,16 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
 
 		if (atomic_read(&sk->sk_wmem_alloc) > limit) {
 			set_bit(TSQ_THROTTLED, &tp->tsq_flags);
-			break;
+			/* It is possible TX completion already happened
+			 * before we set TSQ_THROTTLED, so we must
+			 * test again the condition.
+			 * We abuse smp_mb__after_clear_bit() because
+			 * there is no smp_mb__after_set_bit() yet
+			 */
+			smp_mb__after_clear_bit();
+			if (atomic_read(&sk->sk_wmem_alloc) > limit)
+				break;
+
 		}
 
 		limit = mss_now;

-- cut here --
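
For readers who want to see the second hunk's idea in isolation, here is a
minimal userspace sketch of the "set the flag, issue a barrier, test the
condition again" pattern. It is only an illustration under assumptions: it
uses C11 atomics instead of the kernel primitives, and struct fake_sock,
FAKE_THROTTLED and must_throttle() are made-up names that do not exist in
the kernel.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket fields touched by the patch. */
struct fake_sock {
	atomic_int wmem_alloc;	/* bytes in flight, like sk->sk_wmem_alloc */
	atomic_uint flags;	/* bit 0 plays the role of TSQ_THROTTLED */
};

#define FAKE_THROTTLED 1u

/* Returns true if the sender has to stop and wait for a TX completion. */
static bool must_throttle(struct fake_sock *sk, int limit)
{
	if (atomic_load(&sk->wmem_alloc) <= limit)
		return false;

	/* Announce that we intend to wait for a completion. */
	atomic_fetch_or(&sk->flags, FAKE_THROTTLED);

	/* Full fence, standing in for smp_mb__after_clear_bit(). */
	atomic_thread_fence(memory_order_seq_cst);

	/* A completion may have run before the flag became visible,
	 * so the condition is tested a second time.
	 */
	return atomic_load(&sk->wmem_alloc) > limit;
}
```

As in the patch, the flag stays set even when the second test allows
sending to continue; only the decision to break out of the loop changes.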

It fixes my case, but if you look at the round trip time, it is not even
close to what it used to be. So while this fixes my problem, I am still in
favor of disabling TCP auto corking by default.
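
For anyone who wants to compare both settings on a running kernel: auto
corking is controlled by the net.ipv4.tcp_autocorking sysctl. Below is a
minimal sketch of reading and writing such a 0/1 knob from C. The helper
names read_bool_sysctl() and write_bool_sysctl() are made up for this
example, and writing below /proc/sys is assumed to require root.

```c
#include <stdio.h>

/* Hypothetical helper: read an integer sysctl from the given path.
 * Returns the value read, or -1 if the file cannot be opened or parsed.
 */
static int read_bool_sysctl(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/* Hypothetical helper: write 0 or 1 to the given path.
 * Returns 0 on success, -1 on failure.
 */
static int write_bool_sysctl(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value ? 1 : 0) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Calling write_bool_sysctl("/proc/sys/net/ipv4/tcp_autocorking", 0) as root
would turn auto corking off until the next reboot; read_bool_sysctl() on
the same path shows the current setting.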

https://thomas.glanzmann.de/tmp/tcp_auto_corking_on_patched.pcap.bz2
https://thomas.glanzmann.de/tmp/screenshot-mini-2014-02-08-14:36:25.png

Cheers,
        Thomas