Date:	Thu, 13 Feb 2014 06:58:19 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Florian Westphal <fw@...len.de>
Cc:	netdev@...r.kernel.org, Neal Cardwell <ncardwell@...gle.com>,
	Yuchung Cheng <ycheng@...gle.com>
Subject: Re: [PATCH next resend] tcp: use zero-window when free_space is low

On Thu, 2014-02-13 at 12:52 +0100, Florian Westphal wrote:
> Currently the kernel tries to announce a zero window when free_space
> is below the current receiver mss estimate.
> 
> When a sender is transmitting small packets and the reader consumes data
> slowly (or not at all), the receiver might be unable to shrink the receive
> window because
> 
> a) we cannot withdraw an already-committed receive window, and
> b) we have to round the current rwin up to a multiple of the wscale
>    factor, else we would shrink the current window.
> 
> This causes the receive buffer to fill up until the rmem limit is hit.
> When this happens, we start dropping packets.
> 
> Moreover, tcp_clamp_window() may continue to grow sk_rcvbuf towards
> tcp_rmem[2] even if the socket is not being read from.
> 
> As we cannot avoid the "current_win is rounded up to multiple of mss"
> issue [we would violate a) above], at least try to prevent the receive
> buffer from growing towards the tcp_rmem[2] limit by attempting to move to
> a zero-window announcement when free_space becomes less than 1/16 of the
> currently allowed receive buffer maximum.  If tcp_rmem[2] is large, this
> will increase our chances of getting a zero-window announcement out in time.
> 
> Reproducer:
> On server:
> $ nc -l -p 12345
> <suspend it: CTRL-Z>
> 
> Client:
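> #!/usr/bin/env python
> import socket
> import time
> 
> sock = socket.socket()
> # Disable Nagle so every small send goes out as its own segment.
> sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
> sock.connect(("192.168.4.1", 12345))
> while True:
>    # 23-byte payload; a bytes literal works on both Python 2 and 3.
>    sock.send(b'A' * 23)
>    time.sleep(0.005)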
> 
> 
> The socket buffer on the server side will grow until tcp_rmem[2] is hit,
> at which point the client retransmits data until -ETIMEDOUT:
> 
> tcp_data_queue invokes tcp_try_rmem_schedule which will call
> tcp_prune_queue which calls tcp_clamp_window().  And that function will
> grow sk->sk_rcvbuf up until it eventually hits tcp_rmem[2].
> 
> Cc: Neal Cardwell <ncardwell@...gle.com>
> Cc: Eric Dumazet <eric.dumazet@...il.com>
> Cc: Yuchung Cheng <ycheng@...gle.com>
> Signed-off-by: Florian Westphal <fw@...len.de>
> ---
>  V1 of this patch was deferred, resending to get discussion going again.
>  Changes since v1:
>   - add reproducer to commit message
> 
>  Unfortunately I couldn't come up with something that has no magic
>  ('allowed >> 4') value.  I chose >>4 (1/16th) because it didn't cause
>  throughput limitations in my 'full-mss-sized, steady state' netcat tests.
> 
>  Maybe someone has a better idea?
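
For readers following the thread, here is a rough Python model of the two
pieces of arithmetic discussed in the quoted description: rounding a window
to the wscale granularity, and the proposed "less than 1/16 of the allowed
maximum" test. It is only a sketch and not the kernel code. The function
names and parameters are hypothetical, and combining the existing "below
mss" check with the new 1/16 check in one condition is an assumption about
how the two rules interact.

```python
def round_up_to_wscale(window, wscale):
    """Round a window (bytes) up to the next multiple of 2**wscale.

    The advertised window is carried on the wire in units of 2**wscale,
    so any window we announce is effectively a multiple of that unit.
    """
    granularity = 1 << wscale
    return (window + granularity - 1) & ~(granularity - 1)


def wants_zero_window(free_space, allowed_space, mss):
    """Toy version of the heuristic from the patch description.

    Assumption: a zero window is attempted when free_space is below the
    receiver mss estimate (existing behaviour) or below 1/16 of the
    allowed receive buffer maximum (the proposed 'allowed >> 4' rule).
    """
    return free_space < mss or free_space < (allowed_space >> 4)


if __name__ == "__main__":
    # Illustrative numbers only: a 6 MiB allowed maximum, mss of 1460.
    allowed = 6 * 1024 * 1024
    for free in (400000, 100000, 1000):
        print(free, wants_zero_window(free, allowed, 1460))
```

With a large allowed maximum, the 1/16 rule triggers long before free_space
drops below one mss, which is the point of the patch: there is more room
left to get the zero-window announcement out before packets are dropped.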

Thanks a lot, Florian, for looking at this.

Do we have an SNMP counter tracking the number of times we decided
to send a zero window?

Would you mind waiting while we run our packetdrill tests before this
patch is acknowledged? I suspect it might have some impact.

Thanks!
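
On the SNMP counter question: one way to check what a given kernel exposes
is to parse /proc/net/netstat, which lists counters as pairs of lines (a
line of names followed by a line of values, both with the same prefix).
The sketch below is a minimal parser for that text format. The helper
names are hypothetical, the sample text is illustrative rather than real
output, and the "zerowindow" search string is only a guess at how such a
counter might be named; whether any zero-window counter exists depends on
the kernel.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {name: value}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    counters = {}
    # Lines come in pairs: a header line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        _, nums = values.split(":", 1)
        counters[prefix] = dict(zip(names.split(), map(int, nums.split())))
    return counters


def find_counters(counters, needle):
    """Return every counter whose name contains needle (case-insensitive)."""
    needle = needle.lower()
    return {
        name: value
        for group in counters.values()
        for name, value in group.items()
        if needle in name.lower()
    }


# Illustrative sample, not captured from a real system.
SAMPLE = """TcpExt: SyncookiesSent TCPTimeouts
TcpExt: 0 12
"""

if __name__ == "__main__":
    parsed = parse_netstat(SAMPLE)
    print(parsed["TcpExt"])
    print(find_counters(parsed, "zerowindow"))
```

On a Linux host the same functions can be fed the real file, for example
parse_netstat(open("/proc/net/netstat").read()).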

