Date:	Tue, 09 Sep 2008 01:56:12 -0400
From:	Chris Snook <csnook@...hat.com>
To:	David Miller <davem@...emloft.net>
CC:	rick.jones2@...com, netdev@...r.kernel.org
Subject: Re: RFC: Nagle latency tuning

David Miller wrote:
> From: Chris Snook <csnook@...hat.com>
> Date: Tue, 09 Sep 2008 01:10:05 -0400
> 
>> This is open to debate, but there are certainly a great many apps
>> doing a great deal of very important business that are subject to
>> this problem to some degree.
> 
> Let's be frank and be honest that we're talking about message passing
> financial service applications.

Mostly.

> And I specifically know that the problem they run into is that the
> congestion window doesn't open up because of Nagle _AND_ the fact that
> congestion control is done using packet counts rather than data byte
> totals.  So if you send lots of small stuff, the window doesn't open.
> Nagle just makes this problem worse, rather than creating it.
> 
> And we have a workaround for them, which is a combination of the
> tcp_slow_start_after_idle sysctl in combination with route metrics
> specifying the initial congestion window value to use.
> 
> I specifically added that sysctl for this specific situation.
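
(For anyone following along, that workaround boils down to roughly the
following.  This is a sketch, not a recipe: the route-metric keyword
shown here, initcwnd, and its availability depend on the iproute2 and
kernel versions in use, and <dst>, <gw> and <dev> are placeholders for
the real route.)

    # don't fall back to slow start after the connection goes idle
    sysctl -w net.ipv4.tcp_slow_start_after_idle=0

    # pin a larger initial congestion window on the relevant route
    ip route change <dst> via <gw> dev <dev> initcwnd 10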

That's not the problem I'm talking about here.  The problem I'm seeing 
is that if your burst of messages is too small to fill the MTU, the 
network stack will just sit there and stare at you for precisely 40 ms 
(an eternity for a financial app) before transmitting.  Andi may be 
correct that it's actually the delayed ACK we're seeing, but I can't 
figure out where that 40 ms magic number is coming from.

The easiest way to see the problem is to open a TCP socket to an echo 
daemon on loopback, make a bunch of small writes totaling less than your 
loopback MTU (accounting for overhead), and see how long it takes to get 
your echoes.  You can probably do this with netcat, though I haven't 
tried.  People don't expect loopback to have 40 ms latency when the box 
is lightly loaded, so they'd really like to tweak that down when it's 
hurting them.
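
Something like the sketch below is what I have in mind.  It's untested,
and it assumes an echo service is already listening on port 7 of
loopback (e.g. inetd/xinetd's built-in echo); the write size and count
are arbitrary, they just need to stay well under the loopback MTU.

/*
 * Untested sketch: connect to an echo service on loopback, make a burst
 * of small writes that together stay well under the loopback MTU, then
 * time how long it takes for all of the echoed bytes to come back.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netinet/tcp.h>    /* TCP_NODELAY */

#define NWRITES 8           /* number of small writes in the burst */
#define CHUNK   64          /* bytes per write; 8*64 is well under the
                             * ~16k loopback MTU */

int main(void)
{
    char buf[CHUNK], rcv[NWRITES * CHUNK];
    struct sockaddr_in sin;
    struct timeval start, end;
    int fd, i, got = 0;
    double ms;

    memset(buf, 'x', sizeof(buf));

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(7);                     /* echo service */
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
        perror("connect");
        return 1;
    }

    /* Uncomment to take Nagle out of the picture:
     * int one = 1;
     * setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
     */

    gettimeofday(&start, NULL);

    /* burst of small writes, issued back to back */
    for (i = 0; i < NWRITES; i++)
        if (write(fd, buf, CHUNK) != CHUNK) {
            perror("write");
            return 1;
        }

    /* wait until every echoed byte has come back */
    while (got < NWRITES * CHUNK) {
        ssize_t n = read(fd, rcv + got, sizeof(rcv) - got);
        if (n <= 0) {
            perror("read");
            return 1;
        }
        got += n;
    }

    gettimeofday(&end, NULL);
    ms = (end.tv_sec - start.tv_sec) * 1000.0 +
         (end.tv_usec - start.tv_usec) / 1000.0;
    printf("%d bytes echoed in %.1f ms\n", got, ms);

    close(fd);
    return 0;
}

If the delayed-ACK theory is right, the printed time should include
that ~40 ms stall, and uncommenting the TCP_NODELAY lines should make
it disappear.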

-- Chris