[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTikh4YXzJGgTWsTjpz6BczrfSGYD_qSg3WWmKrs9@mail.gmail.com>
Date: Mon, 7 Feb 2011 16:22:59 -0800
From: Yuchung Cheng <ycheng@...gle.com>
To: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Cc: David Miller <davem@...emloft.net>, Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH] tcp: undo_retrans counter fixes
On Mon, Feb 7, 2011 at 3:36 PM, Ilpo Järvinen <ilpo.jarvinen@...sinki.fi> wrote:
>
> On Mon, 7 Feb 2011, David Miller wrote:
>
> > From: Yuchung Cheng <ycheng@...gle.com>
> > Date: Mon, 7 Feb 2011 14:57:04 -0800
> >
> > > Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
> > > not set or undo_retrans is already 0. This happens when sender receives
> > > more DSACK ACKs than packets retransmitted during the current
> > > undo phase. This may also happen when sender receives DSACK after
> > > the undo operation is completed or cancelled.
> > >
> > > Fix another bug that undo_retrans is incorrectly incremented when
> > > sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
> > > is rare but not impossible.
> > >
> > > Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
> >
> > Looks good, Ilpo could you please review this real quick?
>
> I already too a quick look so you're real lucky, only delay of writing is
> needed... :-)
thanks.
>
> Neither is harmful to "fix" but I think they're partially also checking
> for things which shouldn't cause problems... E.g., undo_retrans is only
> used after checking undo_marker's validity first so I don't think
> undo_marker check is exactly necessary there (but on the other hand it
> does no harm)...
logically we should check the validity of undo_marker/undo_retrans
before we use them? The current code has no problem if
tcp_fastretrans_alert() always call tcp_try_undo_* functions whenever
undo_marker != 0 and undo_retrans == 0. I don't think that's always
true.
>
> The tcp_retransmit_skb problem I don't understand at all as we should be
> fragmenting or resetting pcount to 1 (the latter is true only if all
> bugfixes were included to the kernel where >1 pcount for a rexmitted skb
> was seen). If pcount is indeed >1 we might have other issues too somewhere
We found that sometimes pcount > 1 on real servers. This change keeps
the retrans_out/undo_retrans counters consistent.
> but I fail to remember immediately what they would be. That change is not
> bad though since using +/-1 is something we should be getting rid of
> anyway and on long term it would be nice to make tcp_retransmit_skb to be
> able to take advantage of TSO anyway whenever possible.
>
> I also noticed that the undo_retrans code in sacktag side is still doing
> undo_retrans-- ops which could certainly cause real miscounts, though
> it is extremely unlikely due to the fact that DSACK should be sent
> immediately for a single segment at a time (so the sender would need to
> split+recollapse in between).
I have the same doubt but our servers never hit this condition (pcount
> 1). So I keep this part intact.
>
> --
> i.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists