[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0706011417430.28601@kivilampi-30.cs.helsinki.fi>
Date: Fri, 1 Jun 2007 15:17:00 +0300 (EEST)
From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To: David Miller <davem@...emloft.net>
cc: Netdev <netdev@...r.kernel.org>
Subject: Re: Suspicious fackets_out handling
On Thu, 31 May 2007, David Miller wrote:
> From: "Ilpo_Järvinen" <ilpo.jarvinen@...sinki.fi>
> Date: Mon, 23 Apr 2007 14:28:21 +0300 (EEST)
>
> > There are IMHO two problems in it. First of all, nothing ensures that the
> > skb TCP is fragmenting is actually below the forwardmost sack block (and
> > thus is included to the fackets_out)...
...this is so old thing already that I had to refresh my memory,
haven't been expecting any answer for ages... :-)
> Good catch, I agree with your analysis completely.
...I later on understood (from a comment I found elsewhere) that
fackets_out is occasionally estimated, rather than exact value (which
could be completely fixed in tcp-2.6 due to presence of highest_sack but
I'm also considering complete removal of it too, like you probably
noticed).
> > What I'm not sure of though, is how to fix this in net-2.6(.22), it
> > is due to the fact that there is no pointer/seq which can be used in
> > testing for it like in tcp-2.6 which has the highest_sack.
>
> We can add highest_sack to 2.6.22 in order to fix a bug like this,
> if necessary.
...I think we can leave it as estimate for now, it has been like that
for years and nobody has complained... :-) This problem covers really
marginal number of cases anyway I think (isn't the diff usually
negative: old - new after fragment should be negative unless mss_now
bites).
After I found the comments and analyzed some sacked_out code with NewReno,
I think that usually these estimates are indicated in the code with
tcp_dec_pcount_approx(...) but it wasn't used in this specific case
because it inputs skb instead of int, I'll add clarification of this into
my tcp-2.6 todo list...
> > Second problem is even more obvious: if adjustment here is being
> > done and the sacktag code then uses fastpath at the arrival of the
> > next ACK, the sacktag code will use a stale value from
> > fastpath_cnt_hint and fails to notice all that math TCP did here
> > unless we start clearing fastpath_skb_hint.
>
> Let's try not to clear fastpath_skb_hint here if possible :-)
Np, however, whatever we do, it wouldn't really execute that
often... :-)
> Is it possible to update fastpath_cnt_hint properly perhaps?
I think that would be valid and even accurate as it can checks skb's
seq against fastpath_skb_hint->seq. I'm in a hurry and will be a week
out of reach of internet connectivity but here's quick attempt to deal
with this cleanly. Compile tested (be extra careful with this one, it's
done i a hurry :-)), consider to net-2.6. Considering the marginality of
this issue, stable might really be an overkill for this one, only a
remotely valid concern comes to my mind in this case: possibility
of fackets_out > packet_out might not be dealt that cleanly everywhere
(but I'm sure that you can come to a good decicion about it):
[PATCH] [TCP]: SACK fastpath did override adjusted fackets_out
Do same adjustment to SACK fastpath counters provided that
they're valid.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
---
net/ipv4/tcp_output.c | 9 +++++++++
1 files changed, 9 insertions(+), 0 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 53232dd..f24dd36 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -699,6 +699,15 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss
tp->fackets_out -= diff;
if ((int)tp->fackets_out < 0)
tp->fackets_out = 0;
+ /* SACK fastpath might overwrite it unless dealt with */
+ if (tp->fastpath_skb_hint != NULL &&
+ after(TCP_SKB_CB(tp->fastpath_skb_hint)->seq,
+ TCP_SKB_CB(skb)->seq)) {
+ tp->fastpath_cnt_hint -= diff;
+ if ((int)tp->fastpath_cnt_hint < 0)
+ tp->fastpath_cnt_hint = 0;
+
+ }
}
}
--
1.5.0.6
Powered by blists - more mailing lists