Message-ID: <Pine.LNX.4.58.0910151120340.2879@u.domain.uli>
Date:	Thu, 15 Oct 2009 11:47:51 +0300 (EEST)
From:	Julian Anastasov <ja@....bg>
To:	Willy Tarreau <w@....eu>
cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	eric.dumazet@...il.com
Subject: Re: TCP_DEFER_ACCEPT is missing counter update


	Hello,

On Thu, 15 Oct 2009, Willy Tarreau wrote:

> BTW, I found a use case I didn't think about where current behaviour
> causes trouble :
> 
>    https://bugs.launchpad.net/ubuntu/+source/apache2/+bug/134274
>    http://lkml.indiana.edu/hypermail/linux/kernel/0711.0/0461.html
> 
> In summary, when front proxies establish pools of connections to
> an apache server making use of TCP_DEFER_ACCEPT, the connection
> never establishes on the apache server but silently expires in
> SYN_RECV state. The front proxy sees lots of SYN/ACKs and sends
> many ACKs trying to complete this connection and finally believes
> it got it since the server eventually becomes silent. However,
> when trying to send data over such a socket, the server immediately
> returns an RST.

	Proxies that keep connections open for an insanely long
time should be prepared to retry idempotent methods such as
GET and to send POST requests over fresh connections. They can
close idle connections after, say, 5 seconds. Even if the
server runs with TCP_DEFER_ACCEPT=OFF, it is possible for the
server to send a FIN while the request is in flight (servers are
configured with some period to wait for the first request).
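
As a rough illustration of the proxy-side policy described above, here is a
minimal sketch. The names `PooledConnection`, `may_reuse` and
`should_retry_on_reset` are hypothetical, the 5-second limit is the example
value from the paragraph above, and the set of idempotent methods follows
the HTTP specification rather than anything stated in this thread.

```python
from dataclasses import dataclass

# Methods that HTTP defines as idempotent; POST is deliberately absent.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}

# Example idle limit taken from the text ("say, after 5 seconds").
MAX_IDLE_SECONDS = 5.0


@dataclass
class PooledConnection:
    """Hypothetical pool entry; only the last-use timestamp matters here."""
    last_used: float


def may_reuse(conn: PooledConnection, method: str, now: float) -> bool:
    """Return True if a pooled connection may carry this request.

    Connections idle for too long are not reused, and non-idempotent
    requests (e.g. POST) always go to a fresh connection.
    """
    if now - conn.last_used > MAX_IDLE_SECONDS:
        return False
    return method.upper() in IDEMPOTENT_METHODS


def should_retry_on_reset(method: str) -> bool:
    """Return True if a request that hit an RST or FIN may be resent."""
    return method.upper() in IDEMPOTENT_METHODS
```

With this policy a GET on a connection idle for 2 seconds is sent on the
pooled connection and retried if the server resets it, while a POST always
opens a new connection.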

> Such a problem would not happen if we would only drop the first
> X packets (X >= 1 is already fine), because the front proxy would
> establish the connection, send a second ACK in response to the
> second SYN/ACK and the connection would then really be established
> and would not have to expire early in SYN_RECV state.
> 
> If we really want to behave as it does today, well, let's not fix
> it, but obviously, I fail to see what real world use it has, except
> causing random and hard to debug issues :-/

	The reason is that in SYN_RECV state the server
saves resources. The socket and FD are created only on DATA and
possibly live just for the short time while the response is sent.
If the server is lucky, such resources will live for milliseconds.
Short responses can be sent together with FIN. OTOH, servers
running with TCP_DEFER_ACCEPT=OFF can live with some wakeups
(epoll is fast enough), but the problem is that they hold sockets
for a longer time (the interval between the first ACK and the
first DATA).

> Reading the articles below clearly makes one think it was designed
> to help with HTTP connections by skipping the first expected and
> useless ACK packet before waking up the task :
> 
>    http://httpd.apache.org/docs/1.3/misc/perf-bsd44.html
>    http://articles.techrepublic.com.com/5100-10878_11-1050771.html
> 
> and people still get caught :
> 
>    http://lkml.indiana.edu/hypermail/linux/kernel/0711.0/0416.html

	They wait for minutes because they do not configure
TCP_SYNCNT. TCP_DEFER_ACCEPT works as expected if configured
properly.
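
To see where "minutes" can come from, here is a back-of-the-envelope sketch
of how long a request may sit in SYN_RECV when the SYN-ACK timer doubles on
every retransmission. The numbers are assumptions, not taken from this
message: an initial timeout of 3 seconds and 5 SYN-ACK retries are believed
to be the kernel defaults of that era, and the 120-second cap mirrors the
usual maximum RTO. The function name is hypothetical.

```python
def syn_recv_lifetime(initial_rto: float, retries: int,
                      max_rto: float = 120.0) -> float:
    """Seconds from the first SYN-ACK until the request is dropped.

    The first SYN-ACK waits `initial_rto` seconds, and every retransmitted
    SYN-ACK waits twice as long as the previous one (capped at `max_rto`).
    The request expires when the timer of the last retransmission fires,
    so there are `retries + 1` waiting periods in total.
    """
    total = 0.0
    timeout = initial_rto
    for _ in range(retries + 1):
        total += min(timeout, max_rto)
        timeout *= 2
    return total


if __name__ == "__main__":
    # Assumed defaults: 3 s initial timeout, 5 retries.
    print(syn_recv_lifetime(3.0, 5))  # 3+6+12+24+48+96 = 189.0 seconds
```

Under these assumptions the request lingers for about three minutes, while
lowering the retry count to 2 would shorten it to 3+6+12 = 21 seconds.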

> Maybe it was a bit over-engineered, in the end causing it to fail
> to satisfy the primary goal ?

	If one changes TCP_DEFER_ACCEPT to create the socket, it
will save wakeups but not resources. I'm wondering if the
behavior should be changed at all. As I see it, there are two options:

a) you want to save resources: use TCP_DEFER_ACCEPT. To help
proxies, set large values for both TCP_SYNCNT and TCP_DEFER_ACCEPT.

b) you can live with wakeups and many sockets: do not use
TCP_DEFER_ACCEPT. Suitable for servers using short timeouts
for the first request.
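
For option (a), a minimal sketch of setting both options on a server socket
from user space might look like the following. It assumes Linux, where
Python's `socket` module exposes `TCP_DEFER_ACCEPT` and `TCP_SYNCNT`; the
helper name and the values 30 and 8 are only illustrative. Note that the
kernel may round the TCP_DEFER_ACCEPT period to fit its retransmission
schedule, so the value read back can differ from the one requested.

```python
import socket


def configure_deferred_listener(sock: socket.socket,
                                defer_secs: int,
                                syn_retries: int) -> tuple:
    """Apply TCP_SYNCNT and TCP_DEFER_ACCEPT to a (future) listening socket.

    Returns the values the kernel reports back, which for TCP_DEFER_ACCEPT
    may be rounded relative to `defer_secs`.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT, syn_retries)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT, defer_secs)
    return (
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT),
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_DEFER_ACCEPT),
    )


if __name__ == "__main__":
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Illustrative values: defer for about 30 s, allow up to 8 retries.
    print(configure_deferred_listener(srv, defer_secs=30, syn_retries=8))
    srv.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    srv.listen(128)
    srv.close()
```

For option (b) the server simply never sets TCP_DEFER_ACCEPT and relies on
its own short timeout for the first request.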

> Regards,
> Willy

Regards

--
Julian Anastasov <ja@....bg>