netdev - Re: [PATCH 4/5] tcp: allow undo from reordered DSACKs

Message-ID: <CADVnQykNffEvpr5bnLiLyNGZiuhxx=5wxqFaZjC+Dk4mxe1Okw@mail.gmail.com>
Date:	Thu, 17 Nov 2011 15:49:55 -0500
From:	Neal Cardwell <ncardwell@...gle.com>
To:	Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Cc:	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>,
	Nandita Dukkipati <nanditad@...gle.com>,
	Yuchung Cheng <ycheng@...gle.com>,
	Tom Herbert <therbert@...gle.com>
Subject: Re: [PATCH 4/5] tcp: allow undo from reordered DSACKs

On Thu, Nov 17, 2011 at 12:18 AM, Ilpo Järvinen
<ilpo.jarvinen@...sinki.fi> wrote:
> On Wed, 16 Nov 2011, Neal Cardwell wrote:
>
>> Previously, SACK-enabled connections hung around in TCP_CA_Disorder
>> state while snd_una==high_seq, just waiting to accumulate DSACKs and
>> hopefully undo a cwnd reduction. This could and did lead to the
>> following unfortunate scenario: if some incoming ACKs advance snd_una
>> beyond high_seq then we were setting undo_marker to 0 and moving to
>> TCP_CA_Open, so if (due to reordering in the ACK return path) we
>> shortly thereafter received a DSACK then we were no longer able to
>> undo the cwnd reduction.
>>
>> The change: Simplify the congestion avoidance state machine by
>> removing the behavior where SACK-enabled connections hung around in
>> the TCP_CA_Disorder state just waiting for DSACKs. Instead, when
>> snd_una advances to high_seq or beyond we typically move to
>> TCP_CA_Open immediately and allow an undo in either TCP_CA_Open or
>> TCP_CA_Disorder if we later receive enough DSACKs.
>>
>> Other patches in this series will provide other changes that are
>> necessary to fully fix this problem.
>>
>> Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
>> ---
>>  net/ipv4/tcp_input.c |   15 ++-------------
>>  1 files changed, 2 insertions(+), 13 deletions(-)
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index 751d390..a4efdd7 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -2858,7 +2858,7 @@ static void tcp_try_keep_open(struct sock *sk)
>>       struct tcp_sock *tp = tcp_sk(sk);
>>       int state = TCP_CA_Open;
>>
>> -     if (tcp_left_out(tp) || tcp_any_retrans_done(sk) || tp->undo_marker)
>> +     if (tcp_left_out(tp) || tcp_any_retrans_done(sk))
>>               state = TCP_CA_Disorder;
>>
>>       if (inet_csk(sk)->icsk_ca_state != state) {
>> @@ -3066,17 +3066,6 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
>>                       }
>>                       break;
>>
>> -             case TCP_CA_Disorder:
>> -                     tcp_try_undo_dsack(sk);
>> -                     if (!tp->undo_marker ||
>> -                         /* For SACK case do not Open to allow to undo
>> -                          * catching for all duplicate ACKs. */
>> -                         tcp_is_reno(tp) || tp->snd_una != tp->high_seq) {
>> -                             tp->undo_marker = 0;
>> -                             tcp_set_ca_state(sk, TCP_CA_Open);
>> -                     }
>> -                     break;
>> -
>>               case TCP_CA_Recovery:
>>                       if (tcp_is_reno(tp))
>>                               tcp_reset_reno_sack(tp);
>> @@ -3117,7 +3106,7 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked,
>>                               tcp_add_reno_sack(sk);
>>               }
>>
>> -             if (icsk->icsk_ca_state == TCP_CA_Disorder)
>> +             if (icsk->icsk_ca_state <= TCP_CA_Disorder)
>>                       tcp_try_undo_dsack(sk);
>>
>>               if (!tcp_time_to_recover(sk)) {
>
> How about extending Disorder state until second cumulative ACK that is
> acking >= high_seq?

That would seem to add complexity but only provide a partial solution.

This proposed patch has the virtue of providing a general solution
while simplifying the code a little.
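
To make the scenario from the patch description concrete, here is a
minimal sketch in C. It is a toy model, not kernel code: the struct,
the helper names (on_ack, on_dsack, simulate_reordered_dsack) and the
sequence numbers are all hypothetical, sequence wraparound is ignored,
and only the undo bookkeeping discussed above is modeled. It replays
"ACK advances snd_una past high_seq, then a reordered DSACK arrives"
under the old and the proposed logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: field names loosely mirror struct tcp_sock. */
enum ca_state { CA_OPEN, CA_DISORDER };

struct conn {
	enum ca_state state;
	unsigned int snd_una;     /* first unacknowledged sequence number */
	unsigned int high_seq;    /* snd_nxt when the cwnd reduction began */
	unsigned int undo_marker; /* nonzero while an undo is still possible */
	int undo_retrans;         /* retransmits not yet reported via DSACK */
	bool cwnd_restored;       /* set once the reduction has been undone */
};

/* Undo the reduction once every retransmit was reported as a duplicate. */
static void try_undo_dsack(struct conn *c)
{
	if (c->undo_marker && c->undo_retrans == 0) {
		c->cwnd_restored = true;
		c->undo_marker = 0;
	}
}

/* A cumulative ACK arrives; old_behavior selects the pre-patch logic. */
static void on_ack(struct conn *c, unsigned int ack_seq, bool old_behavior)
{
	assert(ack_seq >= c->snd_una); /* wraparound is not modeled */
	c->snd_una = ack_seq;
	if (c->snd_una < c->high_seq)
		return;

	if (old_behavior) {
		if (c->state == CA_DISORDER) {
			try_undo_dsack(c);
			/* Linger in Disorder only while snd_una == high_seq. */
			if (!c->undo_marker || c->snd_una != c->high_seq) {
				c->undo_marker = 0; /* undo no longer possible */
				c->state = CA_OPEN;
			}
		}
	} else {
		c->state = CA_OPEN; /* undo_marker is left intact */
	}
}

/* A DSACK reports that one retransmission was unnecessary. */
static void on_dsack(struct conn *c, bool old_behavior)
{
	if (c->undo_retrans > 0)
		c->undo_retrans--;
	if (!old_behavior || c->state == CA_DISORDER)
		try_undo_dsack(c);
}

/* Returns true if the cwnd reduction was undone in the scenario. */
static bool simulate_reordered_dsack(bool old_behavior)
{
	struct conn c = {
		.state = CA_DISORDER,
		.snd_una = 1000,
		.high_seq = 2000,
		.undo_marker = 1000,
		.undo_retrans = 1,
		.cwnd_restored = false,
	};

	on_ack(&c, 2500, old_behavior); /* ACK overtakes the DSACK */
	on_dsack(&c, old_behavior);     /* the DSACK arrives late */
	return c.cwnd_restored;
}
```

In this model, simulate_reordered_dsack(true) leaves the reduction in
place because undo_marker was cleared before the DSACK arrived, while
simulate_reordered_dsack(false) undoes it from the Open state.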

What are your concerns with this patch?

Thanks, Ilpo, for taking a look at this patch sequence,
neal