lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.58.0910132335390.3095@u.domain.uli>
Date:	Wed, 14 Oct 2009 00:27:41 +0300 (EEST)
From:	Julian Anastasov <ja@....bg>
To:	Willy Tarreau <w@....eu>
cc:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: TCP_DEFER_ACCEPT is missing counter update


	Hello,

On Tue, 13 Oct 2009, Willy Tarreau wrote:

> >From da80c99a503bab1256706ed8d967e2ab3f71afe0 Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@....eu>
> Date: Tue, 13 Oct 2009 07:26:54 +0200
> Subject: tcp: fix tcp_defer_accept to consider the timeout
> 
> I was trying to use TCP_DEFER_ACCEPT and noticed that if the
> client does not talk, the connection is never accepted and
> remains in SYN_RECV state until the retransmits expire, where
> it finally is deleted. This is bad when some firewall such as

	I think, this is by design, there is big comment in
tcp_check_req().

> netfilter sits between the client and the server because the
> firewall sees the connection in ESTABLISHED state while the
> server will finally silently drop it without sending an RST.

	Client can stay ESTABLISHED for long time but
RST will be sent when client sends DATA or FIN.

> This behaviour contradicts the man page which says it should
> wait only for some time :
> 
>        TCP_DEFER_ACCEPT (since Linux 2.4)
>           Allows a listener to be awakened only when data arrives
>           on the socket.  Takes an integer value  (seconds), this
>           can  bound  the  maximum  number  of attempts TCP will
>           make to complete the connection. This option should not
>           be used in code intended to be portable.

	This works properly in 2.6.31.3, I set TCP_SYNCNT=1
and TCP_DEFER_ACCEPT then only 2 SYN-ACKs are sent.

> Also, looking at ipv4/tcp.c, a retransmit counter is correctly
> computed :

	rskq_defer_accept is threshold, not counter

>         case TCP_DEFER_ACCEPT:
>                 icsk->icsk_accept_queue.rskq_defer_accept = 0;
>                 if (val > 0) {
>                         /* Translate value in seconds to number of
>                          * retransmits */
>                         while (icsk->icsk_accept_queue.rskq_defer_accept < 32 &&
>                                val > ((TCP_TIMEOUT_INIT / HZ) <<
>                                        icsk->icsk_accept_queue.rskq_defer_accept))
>                                 icsk->icsk_accept_queue.rskq_defer_accept++;
>                         icsk->icsk_accept_queue.rskq_defer_accept++;
>                 }
>                 break;
> 
> ==> rskq_defer_accept is used as a counter of retransmits.

	as limit for retransmits, not as counter

> But in tcp_minisocks.c, this counter is only checked. And in
> fact, I have found no location which updates it. So I think
> that what was intended was to decrease it in tcp_minisocks
> whenever it is checked, which the trivial patch below does.

	You can check net/ipv4/inet_connection_sock.c,
inet_csk_reqsk_queue_prune() where TCP_DEFER_ACCEPT can extend
the retransmission threshold for acked sockets above the
applied 'thresh'. So, there are 2 options:

a) TCP_DEFER_ACCEPT is used as flag (eg. 1) or the period is below
the TCP_SYNCNT period. In this case TCP_DEFER_ACCEPT does not
extend the period for DATA (DATA must come before TCP_SYNCNT).
Application is notified only when DATA comes.

or

b) TCP_DEFER_ACCEPT is set with seconds above the TCP_SYNCNT
retrans limit and the first ACK extends the period up to
TCP_DEFER_ACCEPT seconds (converted as retrans). By this
way we provide more time for DATA after the empty ACKs.
ACK again can come before TCP_SYNCNT but DATA after ACK
can come even after TCP_SYNCNT but before TCP_DEFER_ACCEPT
timeout. Again, application is notified only when DATA comes.

> Signed-off-by: Willy Tarreau <w@....eu>
> ---
>  net/ipv4/tcp_minisocks.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index f8d67cc..2f676f3 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -644,6 +644,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
>  	/* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
>  	if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
>  	    TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {

	This is wrong considering inet_csk_reqsk_queue_prune()
and patch should not be applied except if I'm missing something:

> +		inet_csk(sk)->icsk_accept_queue.rskq_defer_accept--;
>  		inet_rsk(req)->acked = 1;
>  		return NULL;
>  	}

Regards

--
Julian Anastasov <ja@....bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ