Message-ID: <CANn89iL38aCi4TWMePFwHDbUzJgV+mGEVMVukx8Z636EPCWEag@mail.gmail.com>
Date:   Fri, 1 Apr 2022 06:54:46 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     Jaco Kroon <jaco@....co.za>
Cc:     Neal Cardwell <ncardwell@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Yuchung Cheng <ycheng@...gle.com>
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections

On Fri, Apr 1, 2022 at 4:36 AM Jaco Kroon <jaco@....co.za> wrote:
>
> Hi Eric,
>
> On 2022/04/01 02:54, Eric Dumazet wrote:
> > On Thu, Mar 31, 2022 at 5:41 PM Eric Dumazet <edumazet@...gle.com> wrote:
> >> On Thu, Mar 31, 2022 at 5:33 PM Jaco Kroon <jaco@....co.za> wrote:
> >>
> >>> I'll deploy same on a dev host we've got in the coming week and start a
> >>> bisect process.
> >> Thanks, this will definitely help.
> > One thing I noticed in your pcap is a good amount of drops, as if
> > Hystart was not able to stop slow-start before the drops are
> > happening.
> >
> > TFO with one less RTT at connection establishment could be the trigger.
> >
> > If you are still using cubic, please try to revert.
> Sorry, I understand TCP itself a bit, but I've given up trying to
> understand the various schedulers a long time ago and am just using the
> defaults that the kernel provides.  How do I check what I'm using, and
> how can I change that?  What is recommended at this stage?

How to check: cat /proc/sys/net/ipv4/tcp_congestion_control

This is of course orthogonal to the bug we are tracking here,
but given your long RTT, I would recommend using the fq packet scheduler and bbr.

tc qdisc replace dev eth0 root fq
# (or use mq+fq if your NIC is multiqueue and you need a good amount of throughput)

modprobe tcp_bbr   # (after enabling CONFIG_TCP_CONG_BBR=m)
echo bbr >/proc/sys/net/ipv4/tcp_congestion_control
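
To confirm which algorithm is active and whether bbr is usable before
switching, something like the sketch below may help. It only reads the two
sysctl files under /proc/sys/net/ipv4. The function name cc_status and the
SYSCTL_DIR override are made up for illustration and are not part of any
existing tool.

```shell
# Sketch: report the active TCP congestion control algorithm and whether
# a wanted one (default: bbr) shows up in the kernel's "available" list.
# SYSCTL_DIR is a hypothetical override so the function can be pointed
# at a copy of the files; by default it reads the real sysctl directory.
cc_status() {
    want="${1:-bbr}"
    dir="${SYSCTL_DIR:-/proc/sys/net/ipv4}"

    # Both files hold plain text; the second is a space-separated list.
    current=$(cat "$dir/tcp_congestion_control") || return 1
    available=$(cat "$dir/tcp_available_congestion_control") || return 1

    echo "current: $current"
    # Pad with spaces so "bbr" does not match a longer name by accident.
    case " $available " in
        *" $want "*) echo "$want: available" ;;
        *)           echo "$want: not available (module not loaded?)" ;;
    esac
}
```

Running "cc_status bbr" before and after loading the module should show
bbr moving from "not available" to "available"; the "current:" line changes
only once the echo into tcp_congestion_control has been done.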


> >
> >
> > commit 4e1fddc98d2585ddd4792b5e44433dcee7ece001
> > Author: Eric Dumazet <edumazet@...gle.com>
> > Date:   Tue Nov 23 12:25:35 2021 -0800
> >
> >     tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows
> Ok, instead of starting with bisect, if I can reproduce in dev I'll use
> this one first.

Thanks! (Again, this won't fix the bug; this is really a shot in the dark.)
