[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0710041516170.31129@kivilampi-30.cs.helsinki.fi>
Date: Fri, 5 Oct 2007 13:37:21 +0300 (EEST)
From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To: TAKANO Ryousei <takano@...-inc.co.jp>
cc: Netdev <netdev@...r.kernel.org>, y-kodama@...t.go.jp
Subject: Re: [RFC][PATCH 2/2] TCP: skip processing cached SACK blocks
On Thu, 4 Oct 2007, TAKANO Ryousei wrote:
> This patch allows to process only newly reported SACK blocks at the
> sender side. An ACK packet contains up to three SACK blocks, and some
"A SACK option that specifies n blocks will have a length of 8*n+2
bytes, so the 40 bytes available for TCP options can specify a
maximum of 4 blocks. It is expected that SACK will often be used in
conjunction with the Timestamp option used for RTTM [Jacobson92],
which takes an additional 10 bytes (plus two bytes of padding); thus
a maximum of 3 SACK blocks will be allowed in this case." [RFC2018]
:-)
> of them may be already reported and processed blocks. This patch
> prevents processing of such already processed SACK blocks.
>
> Signed-off-by: Ryousei Takano <takano-ryousei@...t.go.jp>
> Signed-off-by: Yuetsu Kodama <y-kodama@...t.go.jp>
> ---
> net/ipv4/tcp_input.c | 24 ++++++++++++++++++++++++
> 1 files changed, 24 insertions(+), 0 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index bbad2cd..9615fc9 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -978,6 +978,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> int cached_fack_count;
> int i;
> int first_sack_index;
> + u8 sack_block_skip[4] = {0,0,0,0};
>
> if (!tp->sacked_out)
> tp->fackets_out = 0;
> @@ -1012,6 +1013,21 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> if (before(TCP_SKB_CB(ack_skb)->ack_seq, prior_snd_una - tp->max_window))
> return 0;
>
> + /* Skip processing cached SACK blocks. */
> + for (i = 0; i < num_sacks; i++) {
> + __be32 start_seq = sp[i].start_seq;
> + __be32 end_seq = sp[i].end_seq;
> + int j;
> +
> + for (j = 0; j < ARRAY_SIZE(tp->recv_sack_cache); j++) {
> + if ((tp->recv_sack_cache[j].start_seq == start_seq) &&
> + (tp->recv_sack_cache[j].end_seq == end_seq)) {
> + sack_block_skip[i] = 1;
> + break;
> + }
> + }
> + }
> +
I'm somewhat against adding more and more special cases to sacktag,
there's still need for more special cases after this one to avoid very
expensive processing (I guess they just won't occur in your scenario)!
...I would rather remove whole special case mess of the fastpath and
have a more generic solution (see the patch I point into in the reply
to patch 1/2)...
> /* SACK fastpath:
> * if the only SACK change is the increase of the end_seq of
> * the first block then only apply that SACK block
> @@ -1051,11 +1067,16 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> if (after(ntohl(sp[j].start_seq),
> ntohl(sp[j+1].start_seq))){
> struct tcp_sack_block_wire tmp;
> + u8 sbtmp;
>
> tmp = sp[j];
> sp[j] = sp[j+1];
> sp[j+1] = tmp;
>
> + sbtmp = sack_block_skip[j];
> + sack_block_skip[j] = sack_block_skip[j+1];
> + sack_block_skip[j+1] = sbtmp;
> +
> /* Track where the first SACK block goes to */
> if (j == first_sack_index)
> first_sack_index = j+1;
> @@ -1083,6 +1104,9 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> int fack_count;
> int dup_sack = (found_dup_sack && (i == first_sack_index));
>
> + if (sack_block_skip[i])
DSACKs must always be processed, so please add:
&& !dup_sack
> + continue;
By doing this skipping here, you actually end up crippling lost_retrans
detection even more than it was broken before. ...You probably didn't just
notice that during tests because of unrelated suboptimal behavior (in
fastpath_skb_hint handling). ...Anyway, correctness of this should be
evaluated against the fixed lost_retrans, rather than the already
broken one.
> +
> skb = cached_skb;
> fack_count = cached_fack_count;
Other than what's noted above:
Acked-by: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
--
i.
Powered by blists - more mailing lists