netdev - Re: RE: A Linux TCP SACK Question


Message-ID: <Pine.LNX.4.64.0804081623500.21784@wrl-59.cs.helsinki.fi>
Date:	Tue, 8 Apr 2008 16:45:12 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Wenji Wu <wenji@...l.gov>
cc:	'Sangtae Ha' <sangtae.ha@...il.com>,
	'John Heffner' <johnwheffner@...il.com>,
	'Netdev' <netdev@...r.kernel.org>
Subject: Re: RE: A Linux TCP SACK Question

On Tue, 8 Apr 2008, Wenji Wu wrote:

> > NewReno never retransmitted anything in them (except at the very end
> > of the transfer). Probably something related to how tp->reordering
> > behaves, I suppose...
> 
> Yes, the adaptive tp->reordering will play a role here. 

...What is not clear to me is why NewReno does not go into recovery at
least once near the beginning, or, if it does, why that does not result in
a retransmission.

Which kernel version does this dump come from? In 2.6.24, NewReno is
crippled with TSO, as was recently discovered, i.e., it won't mark super
skbs at the head as lost and thus won't retransmit them. The 2.6.25-rcs are
also still broken (though they'll transmit too much; I won't go into detail
here). DaveM now has the fix for the 2.6.25-rcs in net-2.6.

> > This is probably far-fetched, but could you tell us how you make sure
> > that an earlier connection's metrics are not affecting the later
> > connection?
> >
> > I.e., that the discovered reordering is not transferred across the
> > flows (in a CBI-like manner), giving NewReno an unfair advantage?
> 
> You can reverse the order of the tests, with SACK option on/off. The 
> results are still the same.

OK. I just wanted to make sure that we don't end up tracing some test
setup issue :-).

> Also, according to the source code, tp->reordering will be initialized 
> to "/proc/sys/net/ipv4/tcp_reordering" (default 3), when the new 
> connection is established.

In addition, in tcp_init_metrics():

	if (dst_metric(dst, RTAX_REORDERING) &&
	    tp->reordering != dst_metric(dst, RTAX_REORDERING)) {
		tcp_disable_fack(tp);
		tp->reordering = dst_metric(dst, RTAX_REORDERING);
	}
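
To make the effect of that snippet concrete, here is a minimal user-space
sketch of the initialization order it implies: the sysctl default is applied
first, and a cached per-destination metric, if one exists and differs,
overrides it and turns FACK off. The struct, the function name and the
fack_enabled flag are hypothetical stand-ins invented for illustration, and
the assumption that FACK starts out enabled is mine; none of this is the
kernel's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few bits of tcp_sock used here. */
struct model_tp {
	unsigned int reordering;	/* models tp->reordering */
	int fack_enabled;		/* models whether FACK is still in use */
};

/*
 * sysctl_default models /proc/sys/net/ipv4/tcp_reordering (default 3).
 * cached_metric models dst_metric(dst, RTAX_REORDERING); 0 means that no
 * value has been cached for this destination.
 */
static void model_init_reordering(struct model_tp *tp,
				  unsigned int sysctl_default,
				  unsigned int cached_metric)
{
	assert(tp != NULL);

	/* A new connection starts from the sysctl value. */
	tp->reordering = sysctl_default;
	tp->fack_enabled = 1;	/* assumption: FACK is on at this point */

	/* A differing cached metric wins and disables FACK. */
	if (cached_metric && tp->reordering != cached_metric) {
		tp->fack_enabled = 0;
		tp->reordering = cached_metric;
	}
}
```

With sysctl_default = 3, calling this with cached_metric = 0 leaves
reordering at 3, while cached_metric = 12 yields reordering = 12 with FACK
off. This is the path by which reordering learned on one flow could reach a
later flow to the same destination.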

> After that, tp->reordering is controlled by the adaptive algorithm

Yes, however, the algorithm will be vastly different in those two cases.
The NewReno stuff is in tcp_check_reno_reordering(), and in one other
place, tcp_try_undo_partial(), but the latter only happens during recovery,
I think. SACK, on the other hand, has a number of call sites for
tcp_update_reordering(); check for yourself.

This might be due to my change which made tcp_check_reno_reordering() get
called earlier than it used to be (to remove a transitional state during
which sacked_out contained stale info, including some segments that had
already been cumulatively ACKed). I was quite unsure whether I could safely
do that. It's not clear to me, though, how your test could cause
sacked_out > packets_out-1 to occur, which is necessary for
tcp_update_reordering() to get called with NewReno. The ACK reordering
should just make the number of duplicate ACKs smaller, because some of them
get discarded as old ones: a newer cumulative ACK often arrives a bit
"ahead" of its time, making the remaining lower-sequenced ACKs very close
to no-ops. ...Though I haven't yet done any awk magic to prove that it
doesn't happen in the non-SACK dump.
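
The argument above can be checked with a toy model. The following is a
simplified sketch, not kernel code: it counts whole segments, assumes a
fixed flight with no new transmissions, and uses hypothetical names
(reno_model, reno_on_ack, reno_run). ACKs below snd_una are dropped as old,
ACKs equal to snd_una are counted as duplicates in sacked_out, and the
sacked_out > packets_out-1 condition is evaluated after each ACK. Clamping
sacked_out when the condition holds is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified NewReno sender bookkeeping (unit: segments). */
struct reno_model {
	unsigned int snd_una;		/* first unacknowledged segment */
	unsigned int packets_out;	/* segments still in flight */
	unsigned int sacked_out;	/* dupACKs counted as emulated SACKs */
	unsigned int dupacks;		/* duplicate ACKs actually counted */
	unsigned int old_acks;		/* ACKs discarded as older than snd_una */
	unsigned int reorder_hits;	/* times sacked_out > packets_out-1 held */
};

/* Models the condition that would lead to a reordering update. */
static void reno_check_limit(struct reno_model *s)
{
	if (s->packets_out > 0 && s->sacked_out + 1 > s->packets_out) {
		s->sacked_out = s->packets_out - 1;	/* model assumption */
		s->reorder_hits++;
	}
}

static void reno_on_ack(struct reno_model *s, unsigned int ack)
{
	if (ack < s->snd_una) {		/* old ACK: effectively a no-op */
		s->old_acks++;
		return;
	}
	if (ack == s->snd_una) {	/* duplicate ACK */
		if (s->packets_out == 0)
			return;
		s->sacked_out++;
		s->dupacks++;
	} else {			/* cumulative ACK advances snd_una */
		unsigned int acked = ack - s->snd_una;

		assert(acked <= s->packets_out);
		s->snd_una = ack;
		s->packets_out -= acked;
		/* One acked segment fills the hole; the rest eat dupACKs. */
		if (acked - 1 >= s->sacked_out)
			s->sacked_out = 0;
		else
			s->sacked_out -= acked - 1;
	}
	reno_check_limit(s);
}

/* Feed a sequence of ACK numbers to a fresh model with the given flight. */
static struct reno_model reno_run(unsigned int flight,
				  const unsigned int *acks, size_t n)
{
	struct reno_model s = { 0, flight, 0, 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++)
		reno_on_ack(&s, acks[i]);
	return s;
}
```

For a flight of 5 segments whose first segment is delayed, the in-order ACK
stream {0, 0, 0, 0, 5} gives four counted duplicates and the condition never
holds. If the same ACKs arrive as {0, 0, 5, 0, 0}, only two duplicates are
counted and two ACKs are discarded as old. In this model the condition fires
only when more duplicates arrive than packets_out-1 allows, e.g. five
duplicates for five segments in flight.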

-- 
 i.