Message-ID: <f2016893-bf9e-3b65-4fe8-ff1bba4f4ced@akamai.com>
Date:   Tue, 3 Dec 2019 09:24:34 -0800
From:   Josh Hunt <johunt@...mai.com>
To:     subashab@...eaurora.org, Eric Dumazet <eric.dumazet@...il.com>
Cc:     Neal Cardwell <ncardwell@...gle.com>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: Crash when receiving FIN-ACK in TCP_FIN_WAIT1 state

On 11/29/19 6:51 PM, subashab@...eaurora.org wrote:
>>>> Since tcp_write_queue_purge() calls tcp_rtx_queue_purge() and we're 
>>>> deleting everything in the retrans queue there, doesn't it make 
>>>> sense to zero out all of those associated counters? Obviously 
>>>> clearing sacked_out is helping here, but is there a reason to keep 
>>>> track of lost_out, retrans_out, etc if retrans queue is now empty? 
>>>> Maybe calling tcp_clear_retrans() from tcp_rtx_queue_purge() ?
>>>
>>> First, I would like to understand if we hit this problem on current 
>>> upstream kernels.
>>>
>>> Maybe a backport forgot a dependency.
>>>
>>> tcp_write_queue_purge() calls tcp_clear_all_retrans_hints(), not 
>>> tcp_clear_retrans(),
>>> this is probably for a reason.
>>>
>>> Brute force clearing these fields might hide a serious bug.
>>>
>>
>> I guess we are all too busy to get more understanding on this :/
> 
> Our test devices are on 4.19.x and it is not possible to switch to a newer
> version. Perhaps Josh has seen this on a newer kernel.

Sorry, I've been out of town without email access. To be clear, I've never 
seen this crash. I only noticed that we do not clear some counters 
when we purge the retransmit queue, and this caught my eye while 
debugging another, unrelated issue. I will try to get some cycles this 
week to instrument a kernel and reproduce the behavior I was seeing. My 
concern, IIRC, was more about tcp_left_out() being > packets_out and 
retrans_out, causing tcp_packets_in_flight() to wrap. Anyway, I'll report 
my findings on this thread if they seem relevant; otherwise maybe I'll 
start another discussion thread. I don't want to pollute this one with 
my ramblings...
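
To illustrate the wrap concern, here is a minimal user-space sketch. The 
struct is a hypothetical stand-in for the relevant struct tcp_sock 
counters, and the two helpers are assumed to mirror the arithmetic of 
tcp_left_out() and tcp_packets_in_flight(); this is not the kernel code 
itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for the counters kept in struct tcp_sock. */
struct tcp_counters {
    uint32_t packets_out;  /* segments sent and not yet acked */
    uint32_t sacked_out;   /* segments SACKed by the peer */
    uint32_t lost_out;     /* segments marked lost */
    uint32_t retrans_out;  /* retransmitted segments outstanding */
};

/* Assumed to match tcp_left_out(): segments that have left the network. */
static uint32_t left_out(const struct tcp_counters *tp)
{
    return tp->sacked_out + tp->lost_out;
}

/*
 * Assumed to match tcp_packets_in_flight(). The arithmetic is unsigned,
 * so if left_out() exceeds packets_out + retrans_out the result wraps
 * to a huge value instead of going negative.
 */
static uint32_t packets_in_flight(const struct tcp_counters *tp)
{
    return tp->packets_out - left_out(tp) + tp->retrans_out;
}
```

With packets_out = 0 after a purge and a stale sacked_out = 1, 
packets_in_flight() returns UINT32_MAX rather than 0.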

Josh
