Message-ID: <CALCETrVcG=F6FrcQ3PhGz27t9+6jGh0TGACfVQ6CeUX=EjNsAw@mail.gmail.com>
Date:	Wed, 26 Oct 2011 12:35:22 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Rick Jones <rick.jones2@...com>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH] Add TCP_NO_DELAYED_ACK socket option

On Wed, Oct 26, 2011 at 10:56 AM, Rick Jones <rick.jones2@...com> wrote:
> On 10/25/2011 07:25 PM, Andy Lutomirski wrote:
>>
>> When talking to an unfixable interactive peer that fails to set
>> TCP_NODELAY, disabling delayed ACKs can help mitigate the problem.
>> This is an evil thing to do, but if the entire network is private,
>> it's not that evil.
>>
>> This works around a problem with the remote *application*, so make
>> it a socket option instead of a sysctl or a per-route option.
>>
>> Signed-off-by: Andy Lutomirski<luto@...capital.net>
>> ---
>>
>> This patch is a bit embarrassing.  We talk to remote applications over
>> TCP that are very much interactive but don't set TCP_NODELAY.  These
>> applications apparently cannot be fixed.  As a partial workaround, if we
>> ACK every incoming segment, then as long as they don't transmit two
>> segments per rtt, we do pretty well.
>
> Embarrassing/evil indeed - is it really something to go into the kernel?

That's a good question.  It's in our kernel -- I don't know whether it
should go upstream.

>
> If the networks where this happens are indeed truly private, can they run a
> private kernel?  Or use an LD_PRELOAD hack to wedge-in a
> setsockopt(TCP_NODELAY) call into the application?  Or set something like
> tcp_naglim_def on the application system(s)?  Or have the server application
> make a setsockopt(TCP_MAXSEG) call before listen() to a value one byte below
> that of what the application is sending?

We control our server.  We don't control the server at the other end.
We've tried to get them to do any of the above, but they seem
unwilling or unable to do it.  I suspect that they're using various
pieces from various third-party vendors that just don't care.
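
For reference, the setsockopt(TCP_NODELAY) change suggested above is small on the sending side. A minimal Python sketch, assuming a plain TCP socket; the helper name disable_nagle is hypothetical and is not taken from any of the applications discussed in this thread:

```python
import socket


def disable_nagle(sock: socket.socket) -> bool:
    """Set TCP_NODELAY on sock and report whether the kernel accepted it."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Read the option back; a nonzero value means Nagle is disabled.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("TCP_NODELAY enabled:", disable_nagle(s))
```

The option has to be set by the sender, which is why it does not help when only the receiving end can be changed.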

>
> Is the application actually "virtuous" in sending logically associated data
> in one "send" call, and simply running afoul of Nagle+DelayedACK in having
> multiple distinct requests outstanding at once, or is it actually quite evil
> in that it is sending logically associated data in separate send calls?
>

The remote application generates messages meant for us, and they
appear to send each message in its own segment.  I don't have the
source, so I don't know whether they're really using one send call per
message or whether they're using MSG_MORE, TCP_CORK, or some other
mechanism.  Each message is time-sensitive and should be received as
soon as possible after it's sent (i.e. one-half rtt).  Unfortunately,
when they send two messages and we don't ack the first one, the second
gets delayed.  Turning off delayed acks helps but does not completely
solve the problem.
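
The interaction can be put into rough numbers. The following is a toy timing model, not a description of any real TCP stack: it assumes two small messages, Nagle enabled on the sender, no other traffic in either direction, and a fixed delayed-ACK timer. The function name and the 10 ms rtt and 40 ms timer values are illustrative assumptions.

```python
def second_message_latency(rtt_ms: int, gap_ms: int, ack_delay_ms: int) -> float:
    """One-way latency (ms) of the second of two small messages under Nagle.

    Message 1 is sent at t=0. Message 2 is written by the application at
    t=gap_ms, but Nagle holds it until message 1 has been ACKed.
    """
    # Message 1 reaches the receiver after rtt/2; the receiver waits
    # ack_delay_ms before ACKing; the ACK needs another rtt/2 to return.
    ack_back_at_sender = rtt_ms + ack_delay_ms
    # Message 2 leaves once it has been written AND message 1 is ACKed.
    send_time = max(gap_ms, ack_back_at_sender)
    arrival_time = send_time + rtt_ms / 2
    return arrival_time - gap_ms


if __name__ == "__main__":
    RTT = 10       # assumed round-trip time, ms
    DELACK = 40    # assumed delayed-ACK timer, ms
    print("ideal (one-half rtt):      ", RTT / 2)
    print("delayed ACK, 2 ms apart:   ", second_message_latency(RTT, 2, DELACK))
    print("immediate ACK, 2 ms apart: ", second_message_latency(RTT, 2, 0))
    print("immediate ACK, 20 ms apart:", second_message_latency(RTT, 20, 0))
```

Under these assumptions the second message takes 53 ms with a delayed ACK and 13 ms with an immediate ACK, against an ideal of 5 ms. It only reaches the ideal when the two messages are at least one rtt apart, which is the sense in which ACKing every segment helps without fully solving the problem.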

> rick jones
>
> choir preaching follows:

:)  I agree.  Unfortunately I didn't write all this stuff.