[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-id: <op.u7vsquqxp498uc@nexus>
Date: Tue, 09 Feb 2010 23:29:08 +0100
From: Damian Lukowski <damian@....rwth-aachen.de>
To: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Cc: David Miller <davem@...emloft.net>, Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH 0/3][v2] tcp: fix ICMP-RTO war
Am 09.02.2010, 13:37 Uhr, schrieb Ilpo Järvinen
<ilpo.jarvinen@...sinki.fi>:
> On Mon, 8 Feb 2010, Damian Lukowski wrote:
>
>> Damian Lukowski schrieb:
>> > Am 01.02.2010, 08:33 Uhr, schrieb David Miller <davem@...emloft.net>:
>> >
>> >> From: Damian Lukowski <damian@....rwth-aachen.de>
>> >> Date: Fri, 29 Jan 2010 23:15:51 +0100
>> >>
>> >>> This patches fix the current RTO calculation routine, when
>> >>> srtt and rttvar are zero, yielding an RTO of zero
>> >>> Under some circumstances, TCPs srtt and rttvar are zero,
>> >>> yielding a calculated RTO of zero.
>> >>> This is particularly unfortunate for ICMP based RTO recalculation
>> >>> as introduced in f1ecd5d9e736660 (Revert Backoff [v3]: Revert RTO
>> >>> on ICMP destination unreachable), as it results in RTO
>> retransmission
>> >>> flooding.
>> >>>
>> >>> Thanks to Ilpo Jarvinen for providing debug patches and to
>> >>> Denys Fedoryshchenko for reporting and testing.
>> >>>
>> >>> Signed-off-by: Damian Lukowski <damian@....rwth-aachen.de>
>> >>
>> >> I still haven't seen a detailed enough analysis of why these
>> >> tiny RTOs can come to exist in the first place.
>> >>
>> >> Please show me a list of events, function by function, the value of
>> >> relevant variables and per-socket TCP state, in the TCP stack, that
>> >> show how this ends up happening.
>> >>
>> >> Thanks for all of your work on this so far.
>> >
>> > I might have figured it out, but could not verify it, so maybe you can
>> > comment my thought.
>> >
>> > When a listening TCP receives a SYN, it will send a SYN+ACK
>> > and wait for an ACK to complete the handshake.
>> > Look at tcp_rcv_state_process::step 5::case SYN_RECV::acceptable
>> > and the code after the comment "tcp_ack considers this ACK as
>> duplicate
>> > and does not calculate rtt".
>> >
>> > If the connecting client has disabled timestamps, the rtt statistics
>> > won't be updated here, while the state is changed above.
>> > I printk'ed at the very end of TCP_SYN_RECV and got the following:
>> > state 1 (ESTABLISHED), srtt 0, rttvar 0.
>> >
>> > So my suspicion is: If connectivity breaks right after a listening TCP
>> > has completed the handshake without timestamps, and the listening TCP
>> > sends data after establishing the connection, we will get the observed
>> > behaviour.
>>
>> Just verified that by dropping pure ACKs coming from the originally
>> listening TCP using iptables.
>
> Isn't that "tcp_ack considers" comment like this: we have bug elsewhere
> in
> the code but workaround it here, at least for some of the cases? (I'd put
> it other way around: but alas, only for part of the cases.) ...It sound
> like that to me. What exactly is the reason why rtt shouldn't be
> calculated/initialized on such ACK, anything I'm missing?
The RTT is extracted when traversing tcp_write_queue_head() in
tcp_clean_rtx_queue() but is null when the SYNACK is acknowledged,
so it may have seemed to be a harder issue for the commentator.
I also have been thinking a while how to obtain the timestamp of the
sent SYNACK, but there is no need for it.
We just need to call tcp_ack_no_tstamp(), which will set the
RTO to 3 seconds as defined in RFC 2988.
Regards
Damian
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists