Message-ID: <A2BAEFC30C8FD34388F02C9B3121859D1C24FDDC@eusaamb103.ericsson.se>
Date: Fri, 9 May 2014 18:07:28 +0000
From: Jon Maloy <jon.maloy@...csson.com>
To: David Laight <David.Laight@...LAB.COM>,
'Jon Maloy' <maloy@...jonn.com>,
Erik Hugne <erik.hugne@...csson.com>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Paul Gortmaker <paul.gortmaker@...driver.com>,
"ying.xue@...driver.com" <ying.xue@...driver.com>,
"tipc-discussion@...ts.sourceforge.net"
<tipc-discussion@...ts.sourceforge.net>
Subject: RE: [PATCH net-next 1/8] tipc: decrease connection flow control
window
> -----Original Message-----
> From: David Laight [mailto:David.Laight@...LAB.COM]
> Sent: May-09-14 12:35 PM
> To: 'Jon Maloy'; Erik Hugne
> Cc: Jon Maloy; davem@...emloft.net; netdev@...r.kernel.org; Paul
> Gortmaker; ying.xue@...driver.com; tipc-discussion@...ts.sourceforge.net
> Subject: RE: [PATCH net-next 1/8] tipc: decrease connection flow control
> window
>
> From: Jon Maloy
> > On 05/09/2014 10:59 AM, David Laight wrote:
> > > From: erik.hugne@...csson.
> > >> On Fri, May 09, 2014 at 01:30:43PM +0000, David Laight wrote:
> > >>> Sounds a bit like a badly designed protocol to me...
> > >> Well, I don't like this either, but the current message-oriented
> > >> flow control will eventually be replaced with a byte-based one.
> > >> Right now we're trying to find a way to adapt a message-oriented
> > >> flow control to per-socket buffer constraints, without breaking
> > >> protocol compatibility.
> > > I wasn't thinking of the byte/message flow control problems.
> > > More of only requesting acks every 512 messages
> >
> > Acks are not requested, they are sent out unsolicited at fixed
> > intervals, controlled by the reading capacity of the receiving
> > process.
>
> Requested by the receiving side...
??? Sent by the receiving side.
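To make the distinction concrete, here is a minimal sketch of a
message-counting window with unsolicited acks. It is not the TIPC code:
the class names and the ack interval of 128 are made up for illustration,
and only the 512-message window is taken from this thread.

```python
WINDOW = 512        # unacked messages a sender may have outstanding (from this thread)
ACK_INTERVAL = 128  # ASSUMED value, for illustration only


class Receiver:
    """Hypothetical receiving socket; acks are driven by the reader."""

    def __init__(self):
        self.queued = 0          # messages delivered but not yet read
        self.read_since_ack = 0  # messages read since the last ack was sent

    def deliver(self):
        self.queued += 1

    def read(self):
        """Read one message. Returns the number of messages being acked
        if an unsolicited ack is due, otherwise 0."""
        if self.queued == 0:
            return 0
        self.queued -= 1
        self.read_since_ack += 1
        if self.read_since_ack == ACK_INTERVAL:
            self.read_since_ack = 0
            return ACK_INTERVAL
        return 0


class Sender:
    """Hypothetical sending socket; it counts messages, not bytes."""

    def __init__(self):
        self.unacked = 0

    def try_send(self, rx):
        if self.unacked >= WINDOW:
            return False  # window exhausted; there is no explicit stop message
        self.unacked += 1
        rx.deliver()
        return True

    def on_ack(self, count):
        self.unacked -= count
```

The sender never asks for anything. It keeps going until its own count
hits the window, and the count only drops when the reader on the other
side has consumed enough messages to trigger an ack.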
>
> > The fundamental problem we have is that we acknowledge messages,
> > without considering their size, so a sender won't stop (there is no
> > active stop message from the receiver) until he has sent 512 messages
> > and not received an ack, which in the worst case amounts to 512 x
> > truesize(64KB) = 67 MB outstanding data. This is what the receiving
> > socket must be able to absorb, in order to avoid broken connections.
>
> Even 512 x 64kB is a lot.
> I thought that a 64k skb would only have a truesize of 72k (with 4k pages).
> I'm also guessing that an ethernet controller that can do tcp rx offloading
> might have to allocate 64k skb - so a single 1500 byte ethernet frame is likely
> to be in an skb with a truesize of 72k.
> I'm not sure if this affects your 128k truesize though.
The 128KiB comes from locally allocated buffers. TIPC, being an IPC protocol,
does not use the loopback interface, and hence does not need fragmentation
for node local communication. This gives a latency improvement of ~3 times
as compared to TCP.
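For reference, the worst-case arithmetic as a small sketch. The function
name is made up for illustration; the 512-message window and the 128KiB
and 72k truesize figures are the ones quoted in this thread.

```python
def worst_case_bytes(window_msgs, truesize):
    """Bytes a receiving socket must be able to absorb if the sender
    fills the whole message window before any ack arrives."""
    return window_msgs * truesize


KIB = 1024

# Locally allocated buffers: a 64KB message has a truesize of 128KiB.
local = worst_case_bytes(512, 128 * KIB)
print(local, local / 2**20)   # 67108864 bytes, 64.0 MiB (about 67 MB)

# The 72k truesize David mentions, for comparison.
alt = worst_case_bytes(512, 72 * KIB)
print(alt, alt / 2**20)       # 37748736 bytes, 36.0 MiB
```

Either way the receive buffer has to be sized for tens of megabytes per
connection as long as the window counts messages rather than bytes.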
>
> (Pasted from below)
> > TIPC connections always run over a reliable media, - its own link
> > layer. This means that the connection layer does not have any form of
> > sequence numbering or retransmission, -it just assumes that everything
> > arrives in order and without losses, as long as the link layer doesn't
> > tell it otherwise.
>
> Hmmm... If multiple connections are multiplexed over a single link you are
> completely stuffed!
Not as much as you would think. First, we are not limited to one link, we
can have parallel links across different LANs/VLANs. As a matter of fact,
TIPC was running circles around TCP until a few years ago, but now it is
admittedly behind when it comes to total throughput/interface. Recent
tests using GRO and variable link transmission windows still show that we
can easily fill 2x10G links, which is no less than TCP can do given the physical
constraint, and it is anyway the maximum most systems have these days.
It remains to be seen what happens when we go to 40G or 100G, but
we have some ideas even there.
///jon
>
> David
>
>