netdev - Re: [PATCH] tcp: use rto_min value from socket in retransmits timeout

Message-ID: <CADVnQykFrPByw82NHm-L00cqhaSCuBNAmYbkkJ06SGNitqkxEw@mail.gmail.com>
Date:   Fri, 30 Jul 2021 11:08:59 -0400
From:   Neal Cardwell <ncardwell@...gle.com>
To:     Dmitry Yakunin <zeil@...dex-team.ru>
Cc:     kafai@...com, edumazet@...gle.com, netdev@...r.kernel.org,
        bpf@...r.kernel.org, dmtrmonakhov@...dex-team.ru,
        Yuchung Cheng <ycheng@...gle.com>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        mitradir@...dex-team.ru
Subject: Re: [PATCH] tcp: use rto_min value from socket in retransmits timeout

On Fri, Jul 30, 2021 at 8:37 AM Dmitry Yakunin <zeil@...dex-team.ru> wrote:
>
> Hello, Neal!
>
> Thanks for your reply and explanations.
>
> I agree with all your points, about safe defaults for both timeouts
> and the number of retries. But what the patch does is not changing the
> defaults, it only provides a way to work with these values through
> bpf, which is important in an environment that is way different from
> cellular networks. For example in the modern DC the rto_min value
> should correspond with real RTT, that definitely not 200ms.

It seems your patch and your analysis are conflating several different issues:

(1) how long should rto_min be in datacenter environments?
(2) for reliability/robustness, how long should TCP retry to transmit
data before giving up?
(3) should rto_min just correspond to the real RTT, or other factors
(like delayed ACK timers)?

I am talking about the reliability/robustness cost of your proposal to
tie custom reductions in (1) to automatic custom reductions in (2).
(I'm not talking about safe defaults.)

If BPF or routing table entries customize rto_min, then it's great for
the rto_min knob to customize the RTO timer value to use a lower value
in datacenters to speed up loss recovery (1) (as already happens).

But just because you customize (1) does not imply that it is safe to
massively reduce the answer to (2): it is not safe to cripple
reliability/robustness by (as in your proposed patch) having the
rto_min setting massively reduce the length of time that a TCP
connection retries sending data before giving up and closing the
connection.

The problem caused by your proposal to have rto_min shorten the retry
duration (e.g. a 5ms rto_min leading to only 1.275 seconds of retries)
is a general problem of reliability/robustness, not specific to
cellular paths. My point about cellular networks was just the most
crisp example I could think of, to try to provide a clear and concrete
example.
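
For readers following the arithmetic, here is a minimal sketch of one way
to arrive at the 1.275 second figure. It assumes pure exponential backoff
(each timeout doubles the previous one) and eight timeouts before giving
up. The count of eight and the helper name total_retry_time_ms are
assumptions made for illustration; this message does not state them, and
the sketch is a simplified model, not the kernel's actual code.

```python
def total_retry_time_ms(rto_base_ms, timeouts, rto_max_ms=None):
    """Sum successive retransmission timeouts, doubling after each one.

    rto_max_ms optionally caps each individual timeout (illustrative only).
    """
    total = 0
    rto = rto_base_ms
    for _ in range(timeouts):
        total += rto
        rto *= 2
        if rto_max_ms is not None:
            rto = min(rto, rto_max_ms)
    return total


# Assumption: eight timeouts before the connection gives up.
ASSUMED_TIMEOUTS = 8

for base_ms in (5, 200):
    total_ms = total_retry_time_ms(base_ms, ASSUMED_TIMEOUTS)
    print(f"rto base {base_ms} ms -> gives up after {total_ms / 1000} s")
```

Under these assumptions, a 5ms base yields 5 * (2^8 - 1) = 1275 ms of
retries, while a 200ms base with the same number of timeouts yields 51
seconds. That gap is the robustness cost of tying (1) to (2).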

If you really think it's important for TCP connections to only retry
sending data for 1.275 seconds, then can you please give an example of
when this is important, and then please implement a separate
customization mechanism for that, rather than forcing all Linux users
of the rto_min mechanism to suffer the fallout from tying (1) to (2)?

best regards,
neal