Date:   Thu, 25 Feb 2021 10:05:20 -0500
From:   Neal Cardwell <ncardwell@...gle.com>
To:     Gil Pedersen <kanongil@...il.com>
Cc:     David Miller <davem@...emloft.net>,
        Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        dsahern@...nel.org, Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: TCP stall issue

On Wed, Feb 24, 2021 at 10:36 AM Gil Pedersen <kanongil@...il.com> wrote:
>
>
> > On 24 Feb 2021, at 15.55, Neal Cardwell <ncardwell@...gle.com> wrote:
> >
> > On Wed, Feb 24, 2021 at 5:03 AM Gil Pedersen <kanongil@...il.com> wrote:
> >> Sure, I attached a trace from the server that should illustrate the issue.
> >>
> >> The trace is cut from a longer flow with the server at 188.120.85.11 and a client window scaling factor of 256.
> >>
> >> Packet 78 is a TLP, followed by a delayed DUPACK with a SACK from the client.
> >> The SACK triggers a single segment fast re-transmit with an ignored?? D-SACK in packet 81.
> >> The first RTO happens at packet 82.
> >
> > Thanks for the trace! That is very helpful. I have attached a plot and
> > my notes on the trace, for discussion.
> >
> > AFAICT the client appears to be badly misbehaving, and misrepresenting
> > what has happened.  At each point where the client sends a DSACK,
> > there is an apparent contradiction. Either the client has received
> > that data before, or it hasn't. If the client *has* already received
> > that data, then it should have already cumulatively ACKed it. If the
> > client has *not* already received that data, then it shouldn't send a
> > DSACK for it.
> >
> > Given that, from the server's perspective, the client is
> > misbehaving/lying, it's not clear what inferences the server can
> > safely make. Though I agree it's probably possible to do much better
> > than the current server behavior.
> >
> > A few questions.
> >
> > (a) is there a middlebox (firewall, NAT, etc) in the path?
> >
> > (b) is it possible to capture a client-side trace, to help
> > disambiguate whether there is a client-side Linux bug or a middlebox
> > bug?
>
> Yes, this sounds like a sound analysis, and matches my observation. The client is confused about whether it has the data or not.
>
> Unfortunately I only have that (un-rooted) device available, so I can't do traces on it. The connection path is Client -> Wi-Fi -> NAT -> NAT -> Internet -> Server (which has a basic UFW firewall).
> I will try to do a trace on the first NAT router.
>
> My first priority is to make the server behave better in this case, but I understand that you would like to investigate the client / connection issue as well? From the server POV, this is clearly an edge case, but a fast re-transmit does seem more appropriate.

Regarding improving the server's retransmit behavior and having it use
a fast retransmit here:

I don't think this is a bug in RACK, because the DSACK clearly
indicates that the retransmission was spurious, so all the packets
already marked lost by RACK are thus unmarked.

I guess the questions are:

(a) How would we craft a general heuristic that would cause a fast
retransmit here in the misbehaving receiver case, without causing lots
of spurious retransmits for well-behaved receivers? Do you have a
suggestion?

(b) Do we want to add the new complexity for this heuristic, given
that this is a misbehaving receiver and we don't yet have an
indication that it's a widespread bug?

> Btw. the "client SACKs TLP retransmit" note is not correct. This is an old ACK, which can be seen from the ecr value.

I believe your analysis of the ECR value here is incorrect. The TS ecr
value in ACKs with SACK blocks will generally not match the TS val of
the SACKed segment due to the rules of RFC 7323, specifically rule (2)
on page 17 in section 4.3 (
https://tools.ietf.org/html/rfc7323#section-4.3 ), which says:

   (2)  If:

            SEG.TSval >= TS.Recent and SEG.SEQ <= Last.ACK.sent

        then SEG.TSval is copied to TS.Recent; otherwise, it is ignored.

Because the sequence number on the SACKed TLP retransmit is >
Last.ACK.sent, SEG.TSval is *not* copied to TS.Recent, and so the TS
ecr value does not reflect the TS val of the TLP retransmit.

So AFAICT this is not an old ACK, but is indeed a SACK of the TLP retransmit.
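
To make the rule concrete, here is a minimal Python sketch of the
receiver-side TS.Recent update from rule (2). The Receiver class, its
field names, and the sequence/timestamp numbers are made up for
illustration (they are not taken from the trace), and the sketch
ignores sequence and timestamp wraparound and does not model ACK
generation.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    # Hypothetical receiver state, named after the RFC 7323 variables.
    ts_recent: int = 0      # TS.Recent: value echoed as TSecr in ACKs
    last_ack_sent: int = 0  # Last.ACK.sent: ACK number of the last ACK sent

    def on_segment(self, seg_seq: int, seg_tsval: int) -> int:
        """Apply RFC 7323 section 4.3 rule (2); return the TSecr to echo.

        Assumption: plain integer comparison, no wraparound handling.
        """
        if seg_tsval >= self.ts_recent and seg_seq <= self.last_ack_sent:
            self.ts_recent = seg_tsval
        # Otherwise SEG.TSval is ignored and the old TS.Recent is echoed.
        return self.ts_recent


# Illustrative numbers only.
rx = Receiver(ts_recent=1000, last_ack_sent=5000)

# In-order segment at the cumulative ACK point: TS.Recent is updated.
ecr_in_order = rx.on_segment(seg_seq=5000, seg_tsval=1010)

# Segment above a hole (e.g. a TLP retransmit that gets SACKed):
# SEG.SEQ > Last.ACK.sent, so its newer TSval is not copied.
ecr_after_tlp = rx.on_segment(seg_seq=9000, seg_tsval=1200)

print(ecr_in_order, ecr_after_tlp)
```

In the second call the ACK would carry a SACK block for the new
segment, but its TS ecr would still be the older value, which is why
an "old-looking" ecr alone does not show that the ACK is old.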

best,
neal
