Message-ID: <063D6719AE5E284EB5DD2968C1650D6D463CC4@AcuExch.aculab.com>
Date: Wed, 22 Jan 2014 15:30:48 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Vlad Yasevich' <vyasevich@...il.com>,
'Matija Glavinic Pecotic' <matija.glavinic-pecotic.ext@....com>,
"linux-sctp@...r.kernel.org" <linux-sctp@...r.kernel.org>
CC: Alexander Sverdlin <alexander.sverdlin@....com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH] net: sctp: Fix a_rwnd/rwnd management to reflect real
state of the receiver's buffer
From: Vlad Yasevich
...
> > IIRC the 'size' taken off the socket buffer is the skb's 'true size'
> > and that includes any padding before and after the actual rx data.
> > For short packets the driver may have copied the data into a smaller
> > skb, for long packets it is likely to be more than that of a full
> > length ethernet packet.
> > In either case it can be significantly more than sizeof(sk_buff)
> > (190?) plus the size of the actual data.
>
> SCTP currently doesn't support GRO, so each packet is limited to
> ethernet packet plus sk_buff overhead.
The ethernet driver might still pass up a 2k buffer, or even a 4k one.
Especially if it supports GRO for TCP.
> What throws a real monkey
> wrench into this whole accounting business is SCTP bundling. If you
> bundle multiple messages into the single packet, accounting for it
> is a mess.
How about dividing the 'truesize' by the reference count?
(except that isn't quite right...)
I assume there are multiple skb headers but only one data buffer?
At least the chunks are all for the same connection so end up on
one socket (except I remember some other horrid stuff for datagrams).
> > I'm also not sure that continuously removing 'credit' is a good idea.
> > I've done a lot of comms protocol code, removing credit and 'window
> > slamming' acks are not good ideas.
>
> This patch doesn't continuously remove 'credit'. It advertises the
> closest approximation of the window that we support and computes it
> the same way as we do for Initial Window (available sk_rcvbuf >> 1).
> As the receiver drains the receive queue, available buffer will increase
> and the available window will grow.
Let's assume the socket buffer size is 100k.
We advertise a window of 50k.
We now receive 100 bytes of data, the remote has 49900 window left.
The 'truesize' is something like 190+64(ish)+100+tail_pad, say 400.
Socket buffer space is reduced to 99600 and any ack would give 49800.
So we have taken back 100 bytes of window that the remote thought it
still had (49900 down to 49800).
With that much space it probably doesn't matter much.
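
The arithmetic above can be written out as a small sketch. This is
illustrative Python, not kernel code: the function names are
hypothetical, the 190 and 64 figures are the rough guesses used above,
and the tail pad is simply picked so that the 'truesize' comes out at
the round 400 used in the example.

```python
# Illustrative arithmetic only; none of these names exist in the kernel.
SKB_OVERHEAD = 190     # rough size of the skb header, as guessed above
SHINFO_OVERHEAD = 64   # the "64(ish)" term
TAIL_PAD = 46          # assumption: picked so truesize(100) is 400


def truesize(payload):
    """Approximate buffer space charged to the socket for one packet."""
    return SKB_OVERHEAD + SHINFO_OVERHEAD + payload + TAIL_PAD


def advertised_window(rcvbuf_free):
    """Same rule as the initial window: half the available buffer."""
    return rcvbuf_free >> 1


rcvbuf = 100_000
initial_window = advertised_window(rcvbuf)      # what the remote is told first
payload = 100

# The remote subtracts only the payload it sent from the window it holds.
peer_view = initial_window - payload

# The receiver subtracts the whole truesize from its buffer space.
rcvbuf_free = rcvbuf - truesize(payload)
new_window = advertised_window(rcvbuf_free)

# Window the remote believed it had but that the next ack takes away.
shrink = peer_view - new_window

print(initial_window, peer_view, rcvbuf_free, new_window, shrink)
```

With these numbers the remote's view is 49900 and the next ack would
carry 49800, which is the 100 byte difference described above.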
However if the connection is receive limited then the remote system
will have a second packet in flight that uses the rest of the window.
We then receive an in-sequence but out-of-window packet that refers
to window that we had previously given to the remote system.
I don't know what the sctp (or tcp) code does with such packets.
In my experience it is best to treat them as valid data (unless
there are larger free memory issues) and ack them at some point.
Hopefully the rules of the underlying protocol let you do this!
If you discard these packets then every packet gets sent twice
(or even more often if the data is very short).
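
A minimal sketch of the two policies, continuing the 100k example.
It models the window as a byte offset ("right edge") purely for
illustration; that is a simplification and not a description of what
the sctp or tcp code actually does. All names are hypothetical.

```python
# Toy model: compare a strict and a lenient receiver for data that was
# sent against a window the receiver has since shrunk.

def right_edge(acked_bytes, window):
    """Furthest byte offset the remote has been told it may send up to."""
    return acked_bytes + window


def accept_strict(data_end, current_edge):
    """Accept only data that fits inside the most recently advertised window."""
    return data_end <= current_edge


def accept_lenient(data_end, max_granted_edge):
    """Accept data that fits inside any window previously given out."""
    return data_end <= max_granted_edge


# Initial grant: 50000 bytes from offset 0.
granted_edge = right_edge(0, 50_000)

# After the first 100 bytes, the ack advertises only 49800.
current_edge = right_edge(100, 49_800)

# The remote, still working from the initial grant, already has the
# remaining 49900 bytes in flight.
data_start, data_len = 100, 49_900
data_end = data_start + data_len

max_granted_edge = max(granted_edge, current_edge)

print("strict accepts: ", accept_strict(data_end, current_edge))
print("lenient accepts:", accept_lenient(data_end, max_granted_edge))
```

In this model the strict receiver rejects the in-flight data because
it ends 100 bytes past the shrunken window, so the remote has to send
it again; the lenient receiver keeps it because it lies within window
that was given out earlier.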
David