Message-ID: <CANn89iKN5TiSYaZEwmUHKMXv+Rg1onNOZDOkNRHNKKpPbo2eaw@mail.gmail.com>
Date: Wed, 23 Oct 2019 11:55:30 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: netdev <netdev@...r.kernel.org>, Yuchung Cheng <ycheng@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>
Subject: Re: [Patch net-next 3/3] tcp: decouple TLP timer from RTO timer
On Wed, Oct 23, 2019 at 11:30 AM Cong Wang <xiyou.wangcong@...il.com> wrote:
>
> On Wed, Oct 23, 2019 at 11:14 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > In case you misunderstand, the CPU profiling I used was captured
> > > during 256 parallel TCP_STREAM flows.
> >
> > When I asked you the workload, you gave me TCP_RR output, not TCP_STREAM :/
> >
> > "A single netperf TCP_RR could _also_ confirm the improvement:"
>
> I guess you didn't understand what "also" means? The improvement
> can be measured with both TCP_STREAM and TCP_RR; only the
> CPU profiling was done with TCP_STREAM.
>
Except that I could not measure any gain with TCP_RR, which is expected,
since TCP_RR does not arm the RTO and TLP timers at the same time.
If you found that we were setting both RTO and TLP when sending a 1-byte
message, we need to fix the stack instead of working around it.
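
For reference, here is roughly how the shared timer slot behaves today
(a paraphrase of mainline's inet_csk_reset_xmit_timer(), not the verbatim
code): arming the loss probe replaces a pending RTO and vice versa, so
the two events can never be pending simultaneously.

	/* Paraphrase, not verbatim: RTO and TLP share icsk_retransmit_timer,
	 * with icsk_pending recording which of the two events is armed.
	 */
	static inline void inet_csk_reset_xmit_timer(struct sock *sk,
						     const int what,
						     unsigned long when,
						     unsigned long max_when)
	{
		struct inet_connection_sock *icsk = inet_csk(sk);

		if (when > max_when)
			when = max_when;

		if (what == ICSK_TIME_RETRANS || what == ICSK_TIME_LOSS_PROBE) {
			/* A single pending-event field: setting
			 * ICSK_TIME_LOSS_PROBE here overwrites a pending
			 * ICSK_TIME_RETRANS (and vice versa), which is why
			 * both timers cannot be armed at once.
			 */
			icsk->icsk_pending = what;
			icsk->icsk_timeout = jiffies + when;
			sk_reset_timer(sk, &icsk->icsk_retransmit_timer,
				       icsk->icsk_timeout);
		}
	}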
> BTW, I just tested an unpatched kernel on a machine with 64 CPUs;
> turning TLP on/off makes no difference there, so this is likely related
> to the number of CPUs or to the hardware configuration. This explains
> why you can't reproduce it on your side. So far I can only reproduce
> it on one particular hardware platform too, but it is real.
>
I have hosts with 112 CPUs; I can try on them, but it will take some time.