Message-ID: <746fd1ba6d994ecf8d6e9854abb75409@exmbdft6.ad.twosigma.com>
Date:   Sun, 13 Feb 2022 19:26:32 +0000
From:   Tian Lan <Tian.Lan@...sigma.com>
To:     Eric Dumazet <edumazet@...gle.com>
CC:     Tian Lan <tilan7663@...il.com>, netdev <netdev@...r.kernel.org>,
        "Andrew Chester" <Andrew.Chester@...sigma.com>
Subject: RE: [PATCH] tcp: allow the initial receive window to be greater than
 64KiB

> I suggest that you do not interpret things as " BPF_SOCK_OPS_RWND_INIT could exceed 64KiB"  because it can not.

> If you really need to send more than 64KB in the first RTT, TCP is not a proper protocol.

> 13d3b1ebe287 commit message should have been very clear about the 64K limitation.

I'm not trying to make the sender send more than 64KiB in the first RTT. The change only lets the sender send more starting with the second RTT (after the first ACK for the data is received). Instead of rcv_wnd growing from 64KiB, it can start from a much larger base value.

Without the patch:

RTT:        1        2         3         ...
rcv_wnd:    64KiB    192KiB    576KiB    ...

With the patch (assuming rcv_wnd is set to 512KiB):

RTT:        1        2          3          ...
rcv_wnd:    64KiB    1536KiB    4608KiB    ...
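For anyone who wants to check the arithmetic, here is a minimal Python sketch that reproduces the two tables above. It is only a toy model, not kernel code: the growth factor of 3 per RTT is inferred from the illustrative figures (64 -> 192 -> 576), and the function name and parameters are hypothetical. It assumes the window advertised in the first RTT is capped at 64KiB, while later RTTs grow from the configured base value.

```python
def rcv_wnd_schedule(base_kib, rtts, first_rtt_cap_kib=64, growth=3):
    """Return the rcv_wnd (in KiB) for RTT 1..rtts under the toy model.

    Assumptions (illustrative only, not taken from the kernel):
      - the first RTT is capped at first_rtt_cap_kib;
      - from the second RTT on, the window is the base value
        multiplied by `growth` once per elapsed RTT.
    """
    schedule = [min(base_kib, first_rtt_cap_kib)]
    wnd = base_kib
    for _ in range(rtts - 1):
        wnd *= growth
        schedule.append(wnd)
    return schedule


print(rcv_wnd_schedule(64, 3))   # without the patch: [64, 192, 576]
print(rcv_wnd_schedule(512, 3))  # with a 512KiB base: [64, 1536, 4608]
```

Note that 1536KiB is 1.5MiB and 4608KiB is 4.5MiB.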

Also, commit 13d3b1ebe287 doesn't seem to specify anything about a 64KiB limitation:
https://github.com/torvalds/linux/commit/13d3b1ebe28762c79e981931a41914fae5d04386


-----Original Message-----
From: Eric Dumazet <edumazet@...gle.com> 
Sent: Sunday, February 13, 2022 1:58 PM
To: Tian Lan <Tian.Lan@...sigma.com>
Cc: Tian Lan <tilan7663@...il.com>; netdev <netdev@...r.kernel.org>; Andrew Chester <Andrew.Chester@...sigma.com>
Subject: Re: [PATCH] tcp: allow the initial receive window to be greater than 64KiB

On Sun, Feb 13, 2022 at 10:52 AM Tian Lan <Tian.Lan@...sigma.com> wrote:
>
> > To be clear, if the sender respects the initial window in first RTT,
> > then first ACK it will receive allows a much bigger window (at
> > least 2x), allowing for standard slow start behavior, doubling CWND
> > at each RTT.
> >
> > linux TCP stack is conservative, and wants a proof of remote peer well behaving before opening the gates.
> >
> > The thing is, we have this issue being discussed every 3 months or so, because some people think the RWIN is never changed or something.
> >
> > Last time, we asked to not change the stack, and instead suggested users tune it using eBPF if they really need to bypass TCP standards.
> >
> > https://lkml.org/lkml/2021/12/22/652
>
> I totally understand that Linux wants to be conservative before opening the gates, and I fully support this idea. I think the current Linux behavior is good for low-latency networks, but in an environment with high RTT (e.g. 20ms), rcv_wnd really becomes the bottleneck. A 4MiB transfer took approximately 6 RTTs on average, even with a large initial snd_cwnd. I think allowing a larger default rcv_wnd would greatly reduce the number of RTTs required for the transfer.
>
> From my understanding, BPF_SOCK_OPS_RWND_INIT was added to the kernel to allow users to bypass the default if they choose to. Prior to kernel 4.19, the rcv_wnd set via BPF_SOCK_OPS_RWND_INIT could exceed 64KiB, up to the space. Since then, the initial rwnd has always been limited to 64KiB. This patch would just make the kernel behave as it did prior to 4.19 when rcv_wnd is set by eBPF.
>
> What would you suggest for an application that currently relies on setting a "larger" rcv_wnd via BPF_SOCK_OPS_RWND_INIT? Do you think it would be better to set rcv_wnd after the connection is established?

I suggest that you do not interpret things as " BPF_SOCK_OPS_RWND_INIT could exceed 64KiB"  because it can not.

If you really need to send more than 64KB in the first RTT, TCP is not a proper protocol.

13d3b1ebe287 commit message should have been very clear about the 64K limitation.
