Message-Id: <1234544555.28913.451.camel@ragnarok>
Date: Fri, 13 Feb 2009 12:02:35 -0500
From: Jeremy Jackson <jerj@...lanar.net>
To: Bill Fink <billfink@...dspring.com>
Cc: ilpo.jarvinen@...sinki.fi, Evgeniy Polyakov <zbr@...emap.net>,
bert hubert <bert.hubert@...herlabs.nl>,
"H. Willstrand" <h.willstrand@...il.com>,
Netdev <netdev@...r.kernel.org>
Subject: Re: sendfile()? Re: SO_LINGER dead: I get an immediate RST on
2.6.24?
On Tue, 2009-01-13 at 00:31 -0500, Bill Fink wrote:
> On Mon, 12 Jan 2009, Ilpo Järvinen wrote:
>
> > On Sun, 11 Jan 2009, Bill Fink wrote:
> >
> > > On Mon, 12 Jan 2009, Evgeniy Polyakov wrote:
> > >
> > > > On Mon, Jan 12, 2009 at 12:08:24AM +0100, bert hubert (bert.hubert@...herlabs.nl) wrote:
> > > > > I fully understand. Sometimes I have to talk to stupid devices though. What
An excellent article on this subject:
http://ds9a.nl/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable.txt
"Luckily, it turns out that Linux keeps track of the amount of unacknowledged
data, which can be queried using the SIOCOUTQ ioctl(). Once we see this
number hit 0, we can be reasonably sure our data reached at least the remote
operating system."
Is this the same as the TCP_INFO getsockopt()?
If you follow the progression from write(socket_fd, ...), the data sits in
the socket buffer, and SIOCOUTQ is initially zero. If the connection
started with a zero window, it could sit like that for a while (sometimes
called a "tarpit"). But you should still see the data in your socket
buffer, yes?
So, I think you want to make sure your socket write buffer is empty
(converted to unacked data), *then* make sure your unacked data is 0.
write(sock, buffer, 1000000); // returns 1000000
shutdown(sock, SHUT_WR);
Now wait for SIOCOUTQ to hit 0.
If the window is 0, shutdown() would wait until the slow device sets
window > 0 again, or forever on a tarpitted connection. Either way,
if/when it finishes, you know all data was transmitted; now wait for all
of it to be ACKed with SIOCOUTQ.
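Roughly, the write / shutdown / wait-for-SIOCOUTQ sequence could look like
the sketch below. This is only an illustration, not tested against a real
tarpit: wait_outq_drained() and send_and_drain() are names I made up, the
10 ms poll interval and the timeout are arbitrary, and it is Linux-specific.
As far as I know shutdown(SHUT_WR) can return before the data is on the
wire, so the sketch relies only on the SIOCOUTQ poll to decide when it is
done.

```c
#include <errno.h>
#include <linux/sockios.h>   /* SIOCOUTQ (Linux-specific) */
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: poll SIOCOUTQ until it reports 0 bytes.
 * Returns 0 when drained, 1 on timeout, -1 on ioctl() error. */
static int wait_outq_drained(int fd, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int waited_ms = 0;

    for (;;) {
        int pending = 0;

        if (ioctl(fd, SIOCOUTQ, &pending) < 0)
            return -1;
        if (pending == 0)
            return 0;
        if (waited_ms >= timeout_ms)
            return 1;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}

/* Hypothetical helper: write the whole buffer, half-close the socket,
 * then wait for the output queue to drain. Same return values as above. */
static int send_and_drain(int fd, const char *buf, size_t len, int timeout_ms)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the write */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    if (shutdown(fd, SHUT_WR) < 0)
        return -1;
    return wait_outq_drained(fd, timeout_ms);
}
```

A return of 1 (timeout) is the tarpit case: the caller has to decide
whether to keep waiting or give up and close().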
> > > > > I do find is the TCP_INFO ioctl, which offers this field in struct tcp_info:
> > > > >
> > > > > __u32 tcpi_unacked;
> > > > >
> > > > > Which comes from:
> > > > >
> > > > > struct tcp_sock {
> > > > > ...
> > > > > u32 packets_out; /* Packets which are "in flight" */
> > > > > ...
> > > > > }
> > > > >
> > > > > If this becomes 0, perhaps this might tell me everything I sent was acked?
> > > >
> > > > 0 means that there are no in-flight packets, which is effectively number
> > > > of unacked packets. So if your application waits for this field to
> > > > become zero, it will wait for all sent packets to be acked.
> > >
> > > I use this type of strategy in nuttcp, and it seems to work fine.
> > > I have a loop with a small delay and a check of tcpi_unacked, and
> > > break out of the loop if tcpi_unacked becomes 0 or a defined timeout
> > > period has passed.
> >
> > Checking tcpi_unacked alone won't be reliable. The peer might be slow
> > enough to advertize zero window for a short period of time and during
> > that period you would have packets_out zero...
>
> I'll keep this in mind for the future, although it doesn't seem to
> be a significant issue in practice. I use this scheme to try and
> account for the tcpi_total_retrans for the data stream, so if this
> corner case was hit, it would mean an under reporting of the total
> TCP retransmissions for the nuttcp test.
>
> If I understand you correctly, to hit this corner case, just after
> the final TCP write, there would have to be no packets in flight
> together with a zero TCP window. To make it more bullet-proof, I
> guess after seeing a zero tcpi_unacked, an additional small delay
> should be performed, and then rechecking for a zero tcpi_unacked.
> I don't see anything else obvious (to me anyway) in the tcp_info
> that would be particularly helpful in handling this.
--
Jeremy Jackson
Coplanar Networks
(519)489-4903
http://www.coplanar.net
jerj@...lanar.net