netdev - Re: [PATCH v2 bpf-next 5/7] bpf: sysctl for probe_on

Message-ID: <CAK6E8=c3dJtubxWWFYyVGj5THtbtLFVx69VbbepR5HNwkyc8WQ@mail.gmail.com>
Date:   Mon, 8 Apr 2019 10:38:40 -0700
From:   Yuchung Cheng <ycheng@...gle.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
Cc:     Neal Cardwell <ncardwell@...gle.com>, brakmo <brakmo@...com>,
        netdev <netdev@...r.kernel.org>, Martin Lau <kafai@...com>,
        Alexei Starovoitov <ast@...com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH v2 bpf-next 5/7] bpf: sysctl for probe_on_drop

On Mon, Apr 8, 2019 at 10:07 AM Eric Dumazet <eric.dumazet@...il.com> wrote:
>
>
>
> On 04/08/2019 09:16 AM, Neal Cardwell wrote:
> > On Wed, Apr 3, 2019 at 8:13 PM brakmo <brakmo@...com> wrote:
> >>
> >> When a packet is dropped when calling queue_xmit in  __tcp_transmit_skb
> >> and packets_out is 0, it is beneficial to set a small probe timer.
> >> Otherwise, the throughput for the flow can suffer because it may need to
> >> depend on the probe timer to start sending again. The default value for
> >> the probe timer is at least 200ms, this patch sets it to 20ms when a
> >> packet is dropped and there are no other packets in flight.
> >>
> >> This patch introduces a new sysctl, sysctl_tcp_probe_on_drop_ms, that is
> >> used to specify the duration of the probe timer for the case described
> >> earlier. The allowed values are between 0 and TCP_RTO_MIN. A value of 0
> >> disables setting the probe timer with a small value.
>
> This seems to contradict our recent work ?
>
> See recent Yuchung patch series :
>
> c1d5674f8313b9f8e683c265f1c00a2582cf5fc5 tcp: less aggressive window probing on local congestion
> 590d2026d62418bb27de9ca87526e9131c1f48af tcp: retry more conservatively on local congestion

I would appreciate it if a direct change to the TCP stack used a "tcp:"
subject prefix instead of "bpf:", which is confusing for TCP developers.

A packet being dropped at the local layer is a sign of severe congestion
-- it's caused by the application bursting on many (idle or new)
connections. With this patch, the (many) connections that fail on the
first try (including SYN and pure ACKs) will all come back at 20ms,
instead of at their RTT-adjusted RTOs. So the end effect is the
application repeatedly pounding the local qdisc to squeeze out some
performance.

This patch seems to apply to a special case where local congestion
only lasts for a very short period. I don't think it applies well as a
general principle for congestion control.
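
To make the timing concern concrete, here is a rough sketch (a simplified
model, not kernel code). The 20ms and 200ms values come from the patch
description; the function names, the 120s cap, and the assumption that
every local drop re-arms the same fixed timer are illustrative only.

```python
# Simplified model, not kernel code: when would a connection whose
# transmit keeps getting dropped locally retry under each policy?

def fixed_probe_times(probe_ms, attempts):
    """Retry times (ms) if every local drop re-arms the same fixed timer.

    Assumption for illustration: the timer is never backed off.
    """
    return [probe_ms * (i + 1) for i in range(attempts)]


def backoff_probe_times(base_ms, attempts, max_ms=120_000):
    """Retry times (ms) with exponential backoff.

    The interval doubles after each attempt and is capped at max_ms
    (the cap value is an assumption for this sketch).
    """
    times = []
    now = 0
    for i in range(attempts):
        now += min(base_ms << i, max_ms)
        times.append(now)
    return times


if __name__ == "__main__":
    print("fixed 20ms:     ", fixed_probe_times(20, 4))
    print("backoff 200ms:  ", backoff_probe_times(200, 4))
```

In this model, four retries with the fixed timer all land within 80ms,
while the backed-off timer spreads them over 3 seconds. With many
connections hitting the same qdisc, the first pattern is what I mean by
pounding.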

> 9721e709fa68ef9b860c322b474cfbd1f8285b0f tcp: simplify window probe aborting on USER_TIMEOUT
> 01a523b071618abbc634d1958229fe3bd2dfa5fa tcp: create a helper to model exponential backoff
> c7d13c8faa74f4e8ef191f88a252cefab6805b38 tcp: properly track retry time on passive Fast Open
> 7ae189759cc48cf8b54beebff566e9fd2d4e7d7c tcp: always set retrans_stamp on recovery
> 7f12422c4873e9b274bc151ea59cb0cdf9415cf1 tcp: always timestamp on every skb transmission
>