Date: Fri, 3 Mar 2017 16:31:22 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Xin Long' <lucien.xin@...il.com>
CC: network dev <netdev@...r.kernel.org>,
"linux-sctp@...r.kernel.org" <linux-sctp@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Neil Horman <nhorman@...driver.com>,
Vlad Yasevich <vyasevich@...il.com>
Subject: RE: [PATCH net] sctp: change to save MSG_MORE flag into assoc
From: Xin Long
> Sent: 03 March 2017 15:43
...
> > It is much more important to get MSG_MORE working 'properly' for SCTP
> > than for TCP. For TCP an application can always use a long send.
> "long send" ?, you mean bigger data, or keeping sending?
> I didn't get the difference between SCTP and TCP, they
> are similar when sending data.
With TCP an application can always replace two send()/write()
calls with a single call to writev().
For SCTP two send() calls must be made in order to generate two
data chunks.
So it is much easier for a TCP application to generate 'full'
Ethernet packets.
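
As a minimal userspace sketch of that difference (the helper names are made up for illustration, and MSG_MORE is Linux-specific): on a byte stream both buffers can go down in one writev(), whereas on a message-oriented socket each buffer needs its own send(), so the only way to hint that more data follows is to set MSG_MORE on the first call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Byte stream (e.g. TCP): hand both buffers to the stack in a single
 * system call. Returns the number of bytes written, or -1 on error. */
ssize_t send_coalesced_stream(int fd, const void *a, size_t alen,
                              const void *b, size_t blen)
{
    struct iovec iov[2] = {
        { .iov_base = (void *)a, .iov_len = alen },
        { .iov_base = (void *)b, .iov_len = blen },
    };

    assert(fd >= 0);
    return writev(fd, iov, 2);
}

/* Message-oriented socket (e.g. SCTP): each send() is a separate message,
 * so two calls are needed. MSG_MORE on the first call tells the stack that
 * more data follows. Returns the total bytes accepted, or -1 on error. */
ssize_t send_two_messages(int fd, const void *a, size_t alen,
                          const void *b, size_t blen)
{
    ssize_t n1, n2;

    assert(fd >= 0);
    n1 = send(fd, a, alen, MSG_MORE);
    if (n1 < 0)
        return -1;
    n2 = send(fd, b, blen, 0);   /* MSG_MORE clear: last piece */
    if (n2 < 0)
        return -1;
    return n1 + n2;
}
```

Whether the two SCTP messages actually end up in one packet depends on how the stack honours MSG_MORE, which is what the rest of this thread is about.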
>
> >
> > ...
> >> @@ -1982,6 +1982,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
> >> * breaks.
> >> */
> >> err = sctp_primitive_SEND(net, asoc, datamsg);
> >> + asoc->force_delay = 0;
> >> /* Did the lower layer accept the chunk? */
> >> if (err) {
> >> sctp_datamsg_free(datamsg);
> >
> > I don't think this is right - or needed.
> > You only get to the above if some test has decided to send data chunks.
> > So it just means that the NEXT time someone tries to send data all the
> > queued data gets sent.
> the NEXT time someone tries to send data with "MSG_MORE clear",
> yes, but with "MSG_MORE set", it will still delay.
>
> > I'm guessing that the whole thing gets called in a loop (definitely needed
> > for very long data chunks, or after the window is opened).
> yes, if users keep sending data chunks with MSG_MORE set, no
> data with "MSG_MORE clear" gap.
>
> > Now if an application sends a lot of (say) 100 byte chunks with MSG_MORE
> > set it would expect to see a lot of full ethernet frames be sent.
> right.
> > With the above a frame will be sent (containing all but 1 chunk) when the
> > amount of queued data becomes too large for an ethernet frame, and immediately
> > followed by a second ethernet frame with 1 chunk in it.
> "followed by a second ethernet frame with 1 chunk in it.", I think this's
> what you're really worried about, right ?
> But sctp flush data queue NOT like what you think, it's not keep traversing
> the queue until the queue is empty.
> once a packet with chunks in one ethernet frame is sent, sctp_outq_flush
> will return. it will pack chunks and send the next packet again until some
> other 'event' triggers it, like retransmission or data received from peer.
> I don't think this is a problem.
Erm.... that can't work.
I think there is code to convert a large user send into multiple data chunks.
So if the user does a 4k (say) send, several large chunks get queued.
These would all need to be sent at once.
The same applies when a window update is received.
So somewhere there ought to be a loop that will send more than one packet.
> > Now it might be that the flag needs clearing when retransmissions are queued.
> > OTOH they might get sent for other reasons.
> Before we really overthought about MSG_MORE, no need to care about
> retransmissions, define MSG_MORE, in my opinion, it works more for
> *inflight is 0*, if it's not 0, we shouldn't stop other places flushing them.
Eh? And when Nagle is disabled.
If 'inflight' isn't 0 then most paths don't flush data.
> We cannot let asoc's more_more flag work as global, it will block elsewhere
> sending data chunks, not only sctp_sendmsg.
If the connection was flow controlled off, more 'credit' arrives, there
is less than an Ethernet frame's worth of data pending, and the last send
said 'MSG_MORE', then there is no point sending anything until the application
does a send with MSG_MORE clear.
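
Spelled out as a predicate, the rule I have in mind is roughly the following. This is a hypothetical model, not kernel code: struct tx_state, its fields and should_hold() are names invented here, and the frame payload size is whatever fits in one full frame for the path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-association transmit state (not the kernel's struct). */
struct tx_state {
    size_t pending_bytes;   /* data queued but not yet sent */
    size_t frame_payload;   /* bytes that fit in one full frame (assumed) */
    bool   last_send_more;  /* did the most recent send set MSG_MORE? */
};

/* Return true when queued data should be held back: the application has
 * promised more data and there is not yet a full frame's worth pending. */
bool should_hold(const struct tx_state *s)
{
    assert(s != NULL);
    return s->last_send_more && s->pending_bytes < s->frame_payload;
}
```

With, say, 300 bytes pending against an assumed 1400-byte frame payload, should_hold() is true only while the last send had MSG_MORE set; once a full frame is pending or a send arrives with MSG_MORE clear, it is false and the data can go out.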
I'm not sure what causes a retransmission to send data; I suspect that 'inflight'
can easily be non-zero at that time.
Likely something causes a packet to be generated, which then collects the data chunks.
David