Message-ID: <5878E166.8080800@oracle.com>
Date:   Fri, 13 Jan 2017 17:17:10 +0300
From:   Alexey Kodanev <alexey.kodanev@...cle.com>
To:     David Miller <davem@...emloft.net>,
        Eric Dumazet <eric.dumazet@...il.com>
CC:     netdev@...r.kernel.org, Vasily Isaenko <vasily.isaenko@...cle.com>
Subject: tcp: performance issue with fastopen connections (mss > window)

Hi,

I ran into this issue when running the LTP netstress test on localhost with an
mss greater than the send window advertised by the client (right after the 3WHS).
Here is the test scenario that reproduces it:

The TCP client sends a 32-byte request and the TCP server replies with a 65KB
answer. net.ipv4.tcp_fastopen is set to 3. Also note that the first TCP Fastopen
connection is processed without delay, as tcp_send_mss() sets 'size_goal' to half
of the window size inside tcp_sendmsg().

Though on the 2nd and subsequent connections:

< S  seq 0:0 win 43690 options [mss 65495 wscale 7
          tfo cookie ac6246a51d5422fc] length 32
> S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0
<. ack 1 win 342 length 0

Inside tcp_sendmsg(), tcp_send_mss() returns 65483 in 'mss_now' as well as
in 'size_goal'. As a result, the segment is not queued for transmission until
all the data has been copied from the user buffer. Then, inside
__tcp_push_pending_frames(), it breaks out on the send window test and
continues with the probe timer check, thus introducing a 200ms delay here.

Fragmentation occurs in tcp_write_wakeup()...

+0.2> P. seq 1:43777 ack 1 win 342 length 43776
<. ack 43777, win 1365 length 0
      > P. seq 43777:65001 ack 1 win 342 options length 21224
      ...
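
To make the numbers in the traces easier to follow, here is a small userspace
sketch of the arithmetic. It assumes the 'win 342' in the client's final ACK is
the raw header field, to be shifted by the 'wscale 7' from its SYN. The helper
names are hypothetical, and fits_in_send_window() is only a simplified model of
the kernel's send window test, not its actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Window advertised by the peer after the handshake: the 16-bit header
 * field shifted left by the negotiated window scale (RFC 7323). */
static uint32_t scaled_window(uint16_t win_field, unsigned int wscale)
{
        assert(wscale <= 14);   /* RFC 7323 limits the shift count to 14 */
        return (uint32_t)win_field << wscale;
}

/* Simplified model of the send window test (an illustration only):
 * a segment may be sent if it ends at or before the right edge of
 * the peer's window. */
static bool fits_in_send_window(uint32_t seg_len, uint32_t in_flight,
                                uint32_t snd_wnd)
{
        return in_flight + seg_len <= snd_wnd;
}
```

With these helpers, scaled_window(342, 7) gives 43776, which matches the length
of the first segment sent after the 200ms delay, while a single 65483-byte
segment does not fit into that window.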


I am not sure what the right fix for this is. I guess we could limit 'size_goal'
to the current window or the mss, whichever is currently smaller, e.g.:

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4a04496..3d3bd97 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -860,7 +860,12 @@ static unsigned int tcp_xmit_size_goal(struct sock *sk, u32 mss_now,
                 size_goal = tp->gso_segs * mss_now;
         }

-       return max(size_goal, mss_now);
+       size_goal = max(size_goal, mss_now);
+
+       if (tp->snd_wnd > TCP_MSS_DEFAULT)
+               return min(tp->snd_wnd, size_goal);
+
+       return size_goal;
  }

  static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 1d5331a..0ac133f 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2445,7 +2445,7 @@ void tcp_push_one(struct sock *sk, unsigned int mss_now)
  {
         struct sk_buff *skb = tcp_send_head(sk);

-       BUG_ON(!skb || skb->len < mss_now);
+       BUG_ON(!skb);

         tcp_write_xmit(sk, mss_now, TCP_NAGLE_PUSH, 1, sk->sk_allocation);
  }
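
For illustration, the effect of the tcp.c hunk above on the reported case can be
modelled in userspace. This is only a sketch of the tail of
tcp_xmit_size_goal() with the change applied. The function name and the min/max
helpers are hypothetical, 'snd_wnd' stands in for tp->snd_wnd, and 536 is the
kernel's value of TCP_MSS_DEFAULT.

```c
#include <stdint.h>

#define TCP_MSS_DEFAULT 536U    /* same value as in the kernel headers */

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Userspace model of the patched return path: never go below one mss
 * first, then cap the result by the send window, unless the window is
 * not larger than TCP_MSS_DEFAULT. */
static uint32_t size_goal_with_window_cap(uint32_t size_goal,
                                          uint32_t mss_now,
                                          uint32_t snd_wnd)
{
        size_goal = max_u32(size_goal, mss_now);

        if (snd_wnd > TCP_MSS_DEFAULT)
                return min_u32(snd_wnd, size_goal);

        return size_goal;
}
```

For the connection above (mss_now and size_goal of 65483, window of 43776) this
yields 43776, so 'size_goal' can now be smaller than 'mss_now'. That is
presumably why the skb->len check in tcp_push_one() is relaxed as well.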


Any ideas?

Thanks,
Alexey
