netdev - Re: Crash when receiving FIN-ACK in TCP_FIN


Message-ID: <827f0898-df46-0f05-980e-fffa5717641f@akamai.com>
Date:   Wed, 30 Oct 2019 14:48:00 -0700
From:   Josh Hunt <johunt@...mai.com>
To:     Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>,
        Neal Cardwell <ncardwell@...gle.com>
Cc:     Netdev <netdev@...r.kernel.org>, Yuchung Cheng <ycheng@...gle.com>,
        Eric Dumazet <eric.dumazet@...il.com>
Subject: Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

On 10/30/19 11:27 AM, Subash Abhinov Kasiviswanathan wrote:
>> Thanks. Do you mind sharing what your patch looked like, so we can
>> understand precisely what was changed?
>>
>> Also, are you able to share what the workload looked like that tickled
>> this issue? (web client? file server?...)
> 
> Sure. This was seen only on our regression racks and the workload there
> is a combination of FTP, browsing and other apps.
> 
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index 4374196..9af7497 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -232,7 +232,8 @@ struct tcp_sock {
>                  fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
>                  fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
>                  is_sack_reneg:1,    /* in recovery from loss with SACK reneg? */
> -               unused:2;
> +               unused:1,
> +               wqp_called:1;
>          u8      nonagle     : 4,/* Disable Nagle algorithm? */
>                  thin_lto    : 1,/* Use linear timeouts for thin streams */
>                  recvmsg_inq : 1,/* Indicate # of bytes in queue upon recvmsg */
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 1a1fcb3..0c29bdd 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2534,6 +2534,9 @@ void tcp_write_queue_purge(struct sock *sk)
>          INIT_LIST_HEAD(&tcp_sk(sk)->tsorted_sent_queue);
>          sk_mem_reclaim(sk);
>          tcp_clear_all_retrans_hints(tcp_sk(sk));
> +       tcp_sk(sk)->highest_sack = NULL;
> +       tcp_sk(sk)->sacked_out = 0;
> +       tcp_sk(sk)->wqp_called = 1;
>          tcp_sk(sk)->packets_out = 0;
>          inet_csk(sk)->icsk_backoff = 0;
>   }
> 
> 

Neal,

Since tcp_write_queue_purge() calls tcp_rtx_queue_purge(), and we're
deleting everything in the retransmit queue there, doesn't it make sense
to zero out all of the associated counters? Clearing sacked_out is
obviously helping here, but is there a reason to keep tracking lost_out,
retrans_out, etc. once the retransmit queue is empty? Maybe
tcp_rtx_queue_purge() should call tcp_clear_retrans()?

Josh
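
The suggestion above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct mock_tcp_sock, struct mock_skb and every mock_* function are hypothetical stand-ins, and only the field names (highest_sack, packets_out, sacked_out, lost_out, retrans_out) are borrowed from the thread. It shows the idea that a purge which empties the retransmit queue also resets every counter describing that queue, so no later code can see a non-zero count for packets that no longer exist.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued packet; not the kernel's sk_buff. */
struct mock_skb {
    struct mock_skb *next;
};

/* Hypothetical model of the few tcp_sock fields discussed in the thread. */
struct mock_tcp_sock {
    struct mock_skb *rtx_head;      /* stand-in for the retransmit queue */
    struct mock_skb *highest_sack;  /* points into the queue, or NULL */
    unsigned int packets_out;
    unsigned int sacked_out;
    unsigned int lost_out;
    unsigned int retrans_out;
};

/* Add one packet to the model queue. Returns 0 on success, -1 on failure. */
static int mock_queue_push(struct mock_tcp_sock *tp)
{
    struct mock_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return -1;
    skb->next = tp->rtx_head;
    tp->rtx_head = skb;
    tp->packets_out++;
    return 0;
}

/* Models the idea behind the proposal: reset the retransmit counters. */
static void mock_clear_retrans(struct mock_tcp_sock *tp)
{
    tp->sacked_out = 0;
    tp->lost_out = 0;
    tp->retrans_out = 0;
}

/* An empty queue must not be described by non-zero counters or pointers. */
static int mock_state_consistent(const struct mock_tcp_sock *tp)
{
    if (tp->rtx_head)
        return 1;
    return tp->highest_sack == NULL && tp->packets_out == 0 &&
           tp->sacked_out == 0 && tp->lost_out == 0 &&
           tp->retrans_out == 0;
}

/* Free every queued packet, then clear all state that referred to them. */
static void mock_rtx_queue_purge(struct mock_tcp_sock *tp)
{
    struct mock_skb *skb = tp->rtx_head;

    while (skb) {
        struct mock_skb *next = skb->next;

        free(skb);
        skb = next;
    }
    tp->rtx_head = NULL;
    tp->highest_sack = NULL;
    tp->packets_out = 0;
    mock_clear_retrans(tp);
    assert(mock_state_consistent(tp));
}
```

Doing the reset inside the purge routine itself, rather than at each call site, means a caller cannot forget one of the counters. Whether that placement is right for the real tcp_rtx_queue_purge() is the open question raised in the message.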