Message-ID: <53A071A2.20909@gmail.com>
Date:	Tue, 17 Jun 2014 12:49:38 -0400
From:	Vlad Yasevich <vyasevich@...il.com>
To:	David Laight <David.Laight@...LAB.COM>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: SCTP data chunk bundling when SCTP_NODELAY is set

On 06/17/2014 12:07 PM, David Laight wrote:
> From: Vlad Yasevich
>> On 06/17/2014 10:17 AM, David Laight wrote:
>>> If SCTP_NODELAY is set it is difficult to get SCTP to bundle
>>> data chunks into ethernet packets.
>>> This leads to very high packet rates which bundling could easily
>>> reduce by a factor of 8 or 10.
>>>
>>> Nagle can't really be enabled because it generates unwanted delays
>>> when traffic is light (Nagle only really works for unidirectional bulk
>>> data and command-response when the messages are smaller than the mtu).
>>>
>>> Even if the sending application knows it has more data to send,
>>> there isn't much it can do to get the chunks bundled.
>>>
>>> AFAICT 'corking' the socket even stops full sized packets being
>>> sent - so the application will deadlock if the socket write
>>> buffer size is reached before the socket is 'uncorked'.
>>> This also means that the application can't send back to back
>>> full sized packets unless it uncorks the socket at exactly
>>> the right places.
>>>
>>> MSG_MORE isn't supported by SCTP, but I'm not sure it would help.
>>> You really need a MSG_NO_MORE flag and to leave Nagle enabled.
>>>
>>> About the only thing I can think of is to normally have Nagle
>>> enabled, and then perform the following sequence to force the
>>> buffered data chunks to be sent:
>>> 1) disable Nagle
>>> 2) cork the socket
>>> 3) uncork the socket
>>> 4) enable Nagle
>>> Four socket calls is a little excessive!
>>
>> First, how are you corking an SCTP socket?  There is no SCTP_CORK
>> and looking at the code, I don't see how an SCTP queue can be
>> corked by user...
> 
> I only looked as far as seeing that the code in sm_sideeffect.c
> allows someone else to have corked the socket, and the effect
> that the 'cork' had.

Right.  Cork is pretty dumb right now.  It should work more like TCP
where if you've queued up enough data to bypass Nagle checks, it should
flush some number of full MTU packets.

> 
>> I suppose we could implement SCTP_CORK to do the right thing.
>>
>> One thought is possibly utilizing something like sendmmsg() and passing
>> an extra flag to let it be known that this is a multi-message send
>> that should be queued up by sctp..
> 
> It would be as easy to expose the extra flag to the 'application'
> allowing it to use sendmsg() or sendmmsg().
> While sendmmsg() saves a system call, it is fairly horrid to use.
> (and I'm sending from a kernel driver so don't care about the 
> system call cost!)
> 
> Possibly MSG_MORE with Nagle disabled could invoke the Nagle send
> delay - but you'd need to know whether any chunks in the queue
> had MSG_MORE clear.

That's why doing this with cork would be simpler.  The ULP can just
queue up a bunch of small data chunks and if we pass Nagle checks, they
will be flushed.  If not, uncork will flush them.
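
To make the intended behaviour concrete, here is a rough user-space model
of it.  This is only a sketch, not kernel code: struct cork_queue,
cork_queue_add() and cork_queue_uncork() are hypothetical names, the
queue is counted in plain bytes, and chunk headers and chunk boundaries
are ignored.  While corked, only full-MTU packets leave the queue; the
uncork flushes whatever is left.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a corked send queue (illustration only). */
struct cork_queue {
	size_t mtu;	/* payload bytes that fit in one packet */
	size_t queued;	/* bytes waiting to be sent */
	int corked;	/* non-zero while the user holds the cork */
};

/* Number of packets needed to carry 'bytes', rounding up. */
static size_t packets_for(size_t bytes, size_t mtu)
{
	return (bytes + mtu - 1) / mtu;
}

/*
 * Queue 'len' bytes and return how many packets go out right away.
 * Corked: only completely full packets are released, the remainder
 * stays queued.  Not corked: everything is sent immediately.
 */
static size_t cork_queue_add(struct cork_queue *q, size_t len)
{
	size_t packets;

	assert(q->mtu > 0);
	q->queued += len;
	if (q->corked) {
		packets = q->queued / q->mtu;
		q->queued %= q->mtu;
	} else {
		packets = packets_for(q->queued, q->mtu);
		q->queued = 0;
	}
	return packets;
}

/* Remove the cork and flush the remainder, even if it is a short packet. */
static size_t cork_queue_uncork(struct cork_queue *q)
{
	size_t packets;

	assert(q->mtu > 0);
	packets = packets_for(q->queued, q->mtu);
	q->queued = 0;
	q->corked = 0;
	return packets;
}
```

With an assumed 1400 byte payload per packet, ten 200 byte messages
queued under cork would leave as one full packet during the writes plus
one 600 byte packet at uncork, instead of ten separate packets.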

I could work up a patch for you if you want.

Thanks
-vlad

> It would also be nice to have a 'send on local timeout', rather than
> Nagle's 'wait for everything to be acked'.
> 
> 	David
> 
> 
> 
