Message-ID: <CADVnQynYn0hLJdmBE+=sy8jtVNi6xZexq0T+Lzukub60ft0VAA@mail.gmail.com>
Date:   Mon, 20 Nov 2017 10:05:47 -0500
From:   Neal Cardwell <ncardwell@...gle.com>
To:     Steve Ibanez <sibanez@...nford.edu>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Netdev <netdev@...r.kernel.org>, Florian Westphal <fw@...len.de>,
        Mohammad Alizadeh <alizadeh@...il.mit.edu>,
        Lawrence Brakmo <brakmo@...com>
Subject: Re: Linux ECN Handling

On Mon, Nov 20, 2017 at 2:31 AM, Steve Ibanez <sibanez@...nford.edu> wrote:
> Hi Folks,
>
> I wanted to check back in on this for another update and to solicit
> some more suggestions. I did a bit more digging to try to isolate the
> problem.

Going back to one of your Oct 19 trace snapshots (attached), AFAICT at
the time of the timeout there is actually almost 64 KBytes (352553398
+ 1448 - 352489686 = 65160) of unacknowledged data. So there really
does seem to be a significant chunk of packets that were in-flight
that were then declared lost.
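
As a quick check of that arithmetic, here is a minimal sketch. It reads
352489686 as the oldest unacknowledged sequence number and 352553398 as
the start of the last 1448-byte segment sent; that reading, and the
function name, are assumptions for illustration rather than something
taken from the trace tooling.

```c
#include <stdint.h>

/* Bytes outstanding between the oldest unacknowledged sequence number
 * (snd_una) and the end of the last segment sent (last_seq + last_len).
 * Unsigned 32-bit arithmetic keeps the result correct even if the
 * sequence space wraps around between the two points. */
static uint32_t unacked_bytes(uint32_t snd_una, uint32_t last_seq,
                              uint32_t last_len)
{
    return (last_seq + last_len) - snd_una;
}
```

Calling unacked_bytes(352489686u, 352553398u, 1448u) gives 65160. If
1448 bytes is also the payload size of every segment (an assumption),
that is exactly 45 full-sized segments.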

So here is a possibility: perhaps the combination of CWR+PRR plus
tcp_tso_should_defer() means that PRR can make cwnd so gentle that
tcp_tso_should_defer() thinks we should wait for another ACK to send,
and that ACK doesn't come. Breaking it down, the potential sequence
would be:

(1) tcp_write_xmit() does not send, because the CWR behavior, using
PRR, does not leave enough cwnd for tcp_tso_should_defer() to think we
should send (PRR was originally designed for recovery, which did not
have TSO deferral)

(2) TLP does not fire, because we are in state CWR, not Open

(3) The only remaining option is an RTO, which fires.
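
The three steps above can be sketched as a tiny decision function. This
is a toy model of the conjecture as stated in this message, not kernel
code: the enum and function names are hypothetical (the kernel's own
state names are TCP_CA_Open, TCP_CA_CWR, and so on), and it assumes no
further ACK arrives to restart transmission.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's congestion-avoidance states. */
enum ca_state { CA_OPEN, CA_DISORDER, CA_CWR, CA_RECOVERY, CA_LOSS };

/* What the sender does next in this simplified model. */
enum next_event { EV_SEND, EV_TLP, EV_RTO };

static enum next_event next_sender_event(bool xmit_allowed,
                                         enum ca_state state)
{
    if (xmit_allowed)
        return EV_SEND;   /* step (1) did not block transmission */
    if (state == CA_OPEN)
        return EV_TLP;    /* step (2): TLP is only armed in Open */
    return EV_RTO;        /* step (3): nothing left but the RTO */
}
```

With transmission blocked and the state set to CA_CWR, the model falls
through to EV_RTO, which matches the stall followed by a timeout.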

In other words, the possibility is that, at the time of the stall, the
cwnd is reasonably high, but tcp_packets_in_flight() is also quite
high, so either there is (a) literally no unused cwnd left
(tcp_packets_in_flight() == cwnd), or (b) some mechanism like
tcp_tso_should_defer() is deciding that there is not enough available
cwnd for it to make sense to chop off a fraction of a TSO skb to send
now.

One way to test that conjecture would be to disable
tcp_tso_should_defer() by adding a:

   goto send_now;

at the top of tcp_tso_should_defer().
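
To make cases (a) and (b) and the effect of that experiment concrete,
here is a simplified userspace sketch. It is not the kernel's logic:
the real tcp_tso_should_defer() looks at more inputs than this, and
defer_threshold, bypass_defer, and all other names here are
hypothetical. All quantities are in packets.

```c
#include <stdbool.h>
#include <stdint.h>

enum xmit_decision { XMIT_SEND, XMIT_CWND_FULL, XMIT_TSO_DEFER };

/* Toy model of the send decision.
 * bypass_defer plays the role of the "goto send_now" experiment:
 * when true, the deferral check is skipped entirely. */
static enum xmit_decision xmit_decide(uint32_t cwnd, uint32_t in_flight,
                                      uint32_t defer_threshold,
                                      bool bypass_defer)
{
    if (in_flight >= cwnd)
        return XMIT_CWND_FULL;   /* case (a): no unused cwnd left */

    uint32_t avail = cwnd - in_flight;
    if (!bypass_defer && avail < defer_threshold)
        return XMIT_TSO_DEFER;   /* case (b): too little cwnd to bother */

    return XMIT_SEND;
}
```

In this model, xmit_decide(10, 9, 3, false) defers while
xmit_decide(10, 9, 3, true) sends. If the freezes persist with deferral
bypassed, case (a) or some other check would be the more likely cause.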

If that doesn't prevent the freezes then I would recommend adding
printks or other instrumentation to tcp_write_xmit() to log:

- time
- ca_state
- cwnd
- ssthresh
- tcp_packets_in_flight()
- the reason for breaking out of the tcp_write_xmit() loop (tso
deferral, no packets left, tcp_snd_wnd_test, tcp_nagle_test, etc)
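
One possible shape for such a log record, as a userspace sketch only.
The struct, the enum values, and the output format are all hypothetical;
in the kernel the same fields would be printed with printk or a
tracepoint at each point where the tcp_write_xmit() loop exits.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical labels for why the transmit loop stopped. */
enum break_reason {
    BREAK_TSO_DEFER,   /* TSO deferral */
    BREAK_NO_PACKETS,  /* send queue empty */
    BREAK_SND_WND,     /* receiver window test failed */
    BREAK_NAGLE,       /* Nagle test failed */
    BREAK_CWND         /* no congestion window left */
};

struct xmit_log {
    uint64_t time_us;       /* timestamp, microseconds */
    int ca_state;           /* congestion-avoidance state */
    uint32_t cwnd;          /* congestion window, packets */
    uint32_t ssthresh;      /* slow-start threshold, packets */
    uint32_t in_flight;     /* packets in flight */
    enum break_reason reason;
};

static const char *break_reason_name(enum break_reason r)
{
    switch (r) {
    case BREAK_TSO_DEFER:  return "tso_defer";
    case BREAK_NO_PACKETS: return "no_packets";
    case BREAK_SND_WND:    return "snd_wnd";
    case BREAK_NAGLE:      return "nagle";
    case BREAK_CWND:       return "cwnd";
    }
    return "unknown";
}

/* Formats one record on a single line; returns snprintf's result. */
static int format_xmit_log(char *buf, size_t len, const struct xmit_log *e)
{
    return snprintf(buf, len,
                    "t=%llu ca_state=%d cwnd=%u ssthresh=%u in_flight=%u reason=%s",
                    (unsigned long long)e->time_us, e->ca_state,
                    (unsigned)e->cwnd, (unsigned)e->ssthresh,
                    (unsigned)e->in_flight, break_reason_name(e->reason));
}
```

Keeping every field on one line per loop exit makes it easy to line the
log up against a packet trace around the time of the stall.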

cheers,
neal

Download attachment "han-3_timeout-event.png" of type "image/png" (91597 bytes)
