netdev - Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are disabled

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.20.1806151048000.29120@whs-18.cs.helsinki.fi>
Date:   Fri, 15 Jun 2018 11:05:03 +0300 (EEST)
From:   Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
To:     Michal Kubecek <mkubecek@...e.cz>
cc:     Yuchung Cheng <ycheng@...gle.com>, netdev <netdev@...r.kernel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH RESEND] tcp: avoid F-RTO if SACK and timestamps are
 disabled

On Thu, 14 Jun 2018, Michal Kubecek wrote:

> On Thu, Jun 14, 2018 at 02:51:18PM +0300, Ilpo Järvinen wrote:
> > On Thu, 14 Jun 2018, Michal Kubecek wrote:
> > > On Thu, Jun 14, 2018 at 11:42:43AM +0300, Ilpo Järvinen wrote:
> > > > On Wed, 13 Jun 2018, Yuchung Cheng wrote:
> > > > > On Wed, Jun 13, 2018 at 9:55 AM, Michal Kubecek <mkubecek@...e.cz> wrote:
> > > 
> > > AFAICS RFC 5682 is not explicit about this and offers multiple options.
> > > Anyway, this is not essential and in most of the customer provided
> > > captures, it wasn't the case.
> > 
> > Lacking the new segments is essential for hiding the actual bug as the 
> > trace would look weird otherwise with a burst of new data segments (due 
> > to the other bug).
> 
> The trace wouldn't look so nice but it can be reproduced even with more
> data to send. I've copied an example below. I couldn't find a really
> nice one quickly so that first few retransmits (17:22:13.865105 through
> 17:23:05.841105) are without new data but starting at 17:23:58.189150,
> you can see that sending new (previously unsent) data may not suffice to
> break the loop.

My point was that the new data segment bursts that occur if the sender 
isn't application limited indicate that there's something going wrong
with FRTO. And that wrong is also what is causing that RTO loop because
the sender doesn't see the previous FRTO recovery on second RTO. With 
my FRTO undo fix, (new_recovery || icsk->icsk_retransmits) will be false
and that will prevent the RTO loop.

> > > Normally, we would have timestamps (and even SACK). Without them, you
> > > cannot reliably recognize a dupack with changed window size from
> > > a spontaneous window update.
> > 
> > No! The window should not update window on ACKs the receiver intends to 
> > designate as "duplicate ACKs". That is not without some potential cost 
> > though as it requires delaying window updates up to the next cumulative 
> > ACK. In the non-SACK series one of the changes is fixing this for
> > non-SACK Linux TCP flows.
> 
> That sounds like a reasonable change (at least at the first glance,
> I didn't think about it too deeply) but even if we fix Linux stack to
> behave like this, we cannot force everyone else to do the same.

Unfortunately I don't know what the other stacks besides Linux do. But 
for Linux, the cause for the changing receiver window is the receiver 
window auto-tuning and I'm not sure if other stacks have a similar 
feature (or if that affects (almost) all ACKs like in Linux).


-- 
 i.