[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20071007.151145.32909699.takano@axe-inc.co.jp>
Date: Sun, 07 Oct 2007 15:11:45 +0900 (JST)
From: TAKANO Ryousei <takano@...-inc.co.jp>
To: ilpo.jarvinen@...sinki.fi
Cc: netdev@...r.kernel.org, y-kodama@...t.go.jp
Subject: Re: [RFC][PATCH 2/2] TCP: skip processing cached SACK blocks
Hi Ilpo,
Thanks for your reply.
Most of my response are in the reply to patch 1/2.
From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
Subject: Re: [RFC][PATCH 2/2] TCP: skip processing cached SACK blocks
Date: Fri, 5 Oct 2007 13:37:21 +0300 (EEST)
> On Thu, 4 Oct 2007, TAKANO Ryousei wrote:
>
> > This patch allows to process only newly reported SACK blocks at the
> > sender side. An ACK packet contains up to three SACK blocks, and some
>
> "A SACK option that specifies n blocks will have a length of 8*n+2
> bytes, so the 40 bytes available for TCP options can specify a
> maximum of 4 blocks. It is expected that SACK will often be used in
> conjunction with the Timestamp option used for RTTM [Jacobson92],
> which takes an additional 10 bytes (plus two bytes of padding); thus
> a maximum of 3 SACK blocks will be allowed in this case." [RFC2018]
>
> :-)
>
Yes, indeed:-)
> > of them may be already reported and processed blocks. This patch
> > prevents processing of such already processed SACK blocks.
> >
> > Signed-off-by: Ryousei Takano <takano-ryousei@...t.go.jp>
> > Signed-off-by: Yuetsu Kodama <y-kodama@...t.go.jp>
> > ---
> > net/ipv4/tcp_input.c | 24 ++++++++++++++++++++++++
> > 1 files changed, 24 insertions(+), 0 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index bbad2cd..9615fc9 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -978,6 +978,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> > int cached_fack_count;
> > int i;
> > int first_sack_index;
> > + u8 sack_block_skip[4] = {0,0,0,0};
> >
> > if (!tp->sacked_out)
> > tp->fackets_out = 0;
> > @@ -1012,6 +1013,21 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> > if (before(TCP_SKB_CB(ack_skb)->ack_seq, prior_snd_una - tp->max_window))
> > return 0;
> >
> > + /* Skip processing cached SACK blocks. */
> > + for (i = 0; i < num_sacks; i++) {
> > + __be32 start_seq = sp[i].start_seq;
> > + __be32 end_seq = sp[i].end_seq;
> > + int j;
> > +
> > + for (j = 0; j < ARRAY_SIZE(tp->recv_sack_cache); j++) {
> > + if ((tp->recv_sack_cache[j].start_seq == start_seq) &&
> > + (tp->recv_sack_cache[j].end_seq == end_seq)) {
> > + sack_block_skip[i] = 1;
> > + break;
> > + }
> > + }
> > + }
> > +
>
> I'm somewhat against adding more and more special cases to sacktag,
> there's still need for more special cases after this one to avoid very
> expensive processing (I guess they just won't occur in your scenario)!
> ...I would rather remove whole special case mess of the fastpath and
> have a more generic solution (see the patch I point into in the reply
> to patch 1/2)...
>
> > /* SACK fastpath:
> > * if the only SACK change is the increase of the end_seq of
> > * the first block then only apply that SACK block
> > @@ -1051,11 +1067,16 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> > if (after(ntohl(sp[j].start_seq),
> > ntohl(sp[j+1].start_seq))){
> > struct tcp_sack_block_wire tmp;
> > + u8 sbtmp;
> >
> > tmp = sp[j];
> > sp[j] = sp[j+1];
> > sp[j+1] = tmp;
> >
> > + sbtmp = sack_block_skip[j];
> > + sack_block_skip[j] = sack_block_skip[j+1];
> > + sack_block_skip[j+1] = sbtmp;
> > +
> > /* Track where the first SACK block goes to */
> > if (j == first_sack_index)
> > first_sack_index = j+1;
> > @@ -1083,6 +1104,9 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_
> > int fack_count;
> > int dup_sack = (found_dup_sack && (i == first_sack_index));
> >
> > + if (sack_block_skip[i])
>
> DSACKs must always be processed, so please add:
>
> && !dup_sack
>
I did not notice DSACKs. Thanks.
> > + continue;
>
> By doing this skipping here, you actually end up crippling lost_retrans
> detection even more than it was broken before. ...You probably didn't just
> notice that during tests because of unrelated suboptimal behavior (in
> fastpath_skb_hint handling). ...Anyway, correctness of this should be
> evaluated against the fixed lost_retrans, rather than the already
> broken one.
>
You can find the result that the average goodput slightly improves
against the only PATCH #1 (fixed lost_retrans) applied kernel.
Sorry, our web page is down this weekend for the power outage.
http://projects.gtrc.aist.go.jp/gnet/sack-bug.html
> > +
> > skb = cached_skb;
> > fack_count = cached_fack_count;
>
> Other than what's noted above:
>
> Acked-by: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
>
>
> --
> i.
Regards,
Ryousei Takano
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists