Date:	Wed, 26 Oct 2011 22:35:17 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Rick Jones <rick.jones2@...com>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH] Add TCP_NO_DELAYED_ACK socket option

On Wed, Oct 26, 2011 at 1:06 PM, Rick Jones <rick.jones2@...com> wrote:
>>> If the networks where this happens are indeed truly private, can they run a
>>> private kernel?  Or use an LD_PRELOAD hack to wedge in a
>>> setsockopt(TCP_NODELAY) call into the application?  Or set something like
>>> tcp_naglim_def on the application system(s)?  Or have the server application
>>> make a setsockopt(TCP_MAXSEG) call before listen() to a value one byte below
>>> that of what the application is sending?
>>
>> We control our server.  We don't control the server at the other end.
>> We've tried to get them to do any of the above, but they seem
>> unwilling or unable to do it.  I suspect that they're using various
>> pieces from various third-party vendors that just don't care.
>
> Making the setsockopt(TCP_MAXSEG) call would be at your end :)  Presumably
> based on the minimum message size.  That would cause the connection to have
> an MSS == the request size, so every request send should take the "is this
> send plus any queued unsent data >= MSS" path.

That's cute.  The messages are variable-size (but they don't vary
much), so doing this would probably be worse for the network than
having them set TCP_NODELAY or having us turn off delayed acks, but we
don't really care about the network, and it might work well.
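
For reference, a minimal sketch of the TCP_MAXSEG idea on the receiving
side, in Python on Linux.  The helper name listen_with_mss and the 512-byte
value are made up for illustration; the real value would have to be derived
from the minimum message size, and the kernel rejects values outside the
range it accepts.  The option is set before listen() so that accepted
connections inherit it.  The sender's per-segment payload may end up
smaller than the advertised value when TCP options such as timestamps are
in use.

```python
import socket


def listen_with_mss(mss, host="127.0.0.1", port=0, backlog=16):
    """Return a listening TCP socket that requests a small MSS from peers.

    `mss` is a hypothetical value picked from the smallest expected message.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Set before listen() so that accepted connections inherit the clamp.
        srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, mss)
        srv.bind((host, port))
        srv.listen(backlog)
    except OSError:
        # Raised, for example, if the kernel refuses the requested MSS.
        srv.close()
        raise
    return srv


if __name__ == "__main__":
    srv = listen_with_mss(512)
    print("listening on", srv.getsockname())
    srv.close()
```

With variable-size messages, anything longer than the chosen MSS is split
across several segments, which is the network cost mentioned above.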

>
> Another "at your end" possibility would be setting a rather small SO_RCVBUF
> size at your end before calling listen(), in hopes of triggering the window
> update.

That scares me.  If they ever start sending in bursts (it happens on
occasion), then we lose whenever they need to exceed an artificially
small window.
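
For reference, a rough sketch of the small-SO_RCVBUF suggestion, in Python.
The helper name listen_with_rcvbuf and the 4096-byte request are
illustrative assumptions, not values from this thread.  The kernel is free
to adjust the request (Linux doubles it and enforces a minimum), so the
sketch reads the effective size back rather than trusting the requested one.

```python
import socket


def listen_with_rcvbuf(rcvbuf_bytes, host="127.0.0.1", port=0, backlog=16):
    """Return (listening socket, effective SO_RCVBUF) for a small receive buffer.

    `rcvbuf_bytes` is a hypothetical request; the kernel may change it.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Set before listen() so that accepted connections start out small.
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
        effective = srv.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        srv.bind((host, port))
        srv.listen(backlog)
    except OSError:
        srv.close()
        raise
    return srv, effective


if __name__ == "__main__":
    srv, effective = listen_with_rcvbuf(4096)
    print("requested 4096 bytes, kernel reports", effective)
    srv.close()
```

The effective size bounds the window that can be advertised, so a burst
larger than that stalls until the application drains the socket.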

>
>>> Is the application actually "virtuous" in sending logically associated data
>>> in one "send" call, and simply running afoul of Nagle+DelayedACK in having
>>> multiple distinct requests outstanding at once, or is it actually quite evil
>>> in that it is sending logically associated data in separate send calls?
>>>
>>
>> The remote application generates messages meant for us, and they
>> appear to send each message in its own segment.  I don't have the
>> source, so I don't know whether they're really using one send call per
>> message or whether they're using MSG_MORE, TCP_CORK, or some other
>> mechanism.  Each message is time-sensitive and should be received as
>> soon as possible after it's sent (i.e. one-half RTT).  Unfortunately,
>> when they send two messages and we don't ack the first one, the second
>> gets delayed.  Turning off delayed acks helps but does not completely
>> solve the problem.
>
> If it is write, write, read (multiple sends per logical message), then in a
> packet trace you should see a partial request in the first segment, followed
> by the rest of the request (and perhaps the second through Nth) in the second
> segment.  Or, I suppose your server application would have a receive
> complete with the first part of the first request, getting the second part
> of the request in a subsequent receive call.
>
> If it is multiple requests at a time each sent in one send call, you should
> see a first segment arriving with a complete request within it, followed by
> a second segment with the next request(s).

These are asynchronous messages and we don't reply to the vast
majority of them.  We see one request arriving per segment.

I'll play with TCP_MAXSEG.  But I'll probably leave TCP_NO_DELAYED_ACK
patched into my kernel for the time being.  I'm not thrilled about
forcing the other side to split their messages across multiple
segments.
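
For anyone on an unpatched kernel, a rough approximation, sketched in
Python.  TCP_NO_DELAYED_ACK exists only in the patch under discussion, so
this swaps in Linux's existing TCP_QUICKACK option instead.  That flag is
not permanent, so the sketch re-arms it after every receive.  The helper
name recv_with_quickack is made up, and this is not claimed to behave
identically to the patch.

```python
import socket


def recv_with_quickack(conn, bufsize=4096):
    """Receive once, then re-enable quick ACKs on platforms that support it."""
    data = conn.recv(bufsize)
    # TCP_QUICKACK is Linux-specific; skip it quietly where it is missing.
    quickack = getattr(socket, "TCP_QUICKACK", None)
    if quickack is not None:
        # The kernel can leave quickack mode on its own, so set it again
        # after each receive.
        conn.setsockopt(socket.IPPROTO_TCP, quickack, 1)
    return data


if __name__ == "__main__":
    # Loopback demonstration with a throwaway listener and client.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    conn, _ = srv.accept()
    cli.sendall(b"message")
    print(recv_with_quickack(conn))
    for s in (conn, cli, srv):
        s.close()
```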

--Andy
