Message-ID: <1345917157.19483.1063.camel@edumazet-glaptop>
Date: Sat, 25 Aug 2012 19:52:37 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Cristian Rodríguez <crrodriguez@...nsuse.org>
Cc: netdev@...r.kernel.org, Yuchung Cheng <ycheng@...gle.com>,
    Neal Cardwell <ncardwell@...gle.com>
Subject: Re: BUG: soft lockup - CPU#6 stuck for 22s! [httpd2-event:15597]

On Sat, 2012-08-25 at 13:47 +0200, Eric Dumazet wrote:
> On Sat, 2012-08-25 at 11:14 +0200, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@...gle.com>
> >
> > On Sat, 2012-08-25 at 10:59 +0200, Eric Dumazet wrote:
> > > On Fri, 2012-08-24 at 20:50 -0400, Cristian Rodríguez wrote:
> > > > Hi, the issue I reported with IPV6 few weeks ago seems to be gone, but
> > > > now I am getting the following crash..
> >
> > > Oh, I now see the bug, I'll send a patch asap
> >
> > Please try the following fix.
> >
> > Thanks !
>
> Well, this v2 seems cleaner :
>
> [PATCH v2] tcp: tcp_slow_start() should not decrease snd_cwnd
>
> Cristian Rodríguez reported various lockups in TCP stack,
> introduced by commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start())
>
> We could exit tcp_slow_start() with a zeroed snd_cwnd,
> and next time we enter tcp_slow_start(), we run an infinite loop.
>
> Reported-by: Cristian Rodríguez <crrodriguez@...nsuse.org>
> Cc: Yuchung Cheng <ycheng@...gle.com>
> Cc: Neal Cardwell <ncardwell@...gle.com>
> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> ---
> net/ipv4/tcp_cong.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> index 1432cdb..e656c72 100644
> --- a/net/ipv4/tcp_cong.c
> +++ b/net/ipv4/tcp_cong.c
> @@ -337,7 +337,7 @@ void tcp_slow_start(struct tcp_sock *tp)
>  		tp->snd_cwnd_cnt -= tp->snd_cwnd;
>  		delta++;
>  	}
> -	tp->snd_cwnd = min(tp->snd_cwnd + delta, tp->snd_cwnd_clamp);
> +	tp->snd_cwnd = clamp(tp->snd_cwnd + delta, tp->snd_cwnd, tp->snd_cwnd_clamp);
>  }
>  EXPORT_SYMBOL_GPL(tcp_slow_start);
>
>
Hmm...
We probably have a bug in tcp_metrics.c, because snd_cwnd_clamp should
not be zero.
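
To make the failure mode concrete, here is a reduced userspace model of
the tail of tcp_slow_start() as it stood before the patch. It is only a
sketch: struct model_sock, model_slow_start(), the credit parameter and
MODEL_LOOP_CAP are hypothetical names, the while condition is inferred
from the diff context, and the cap exists only so the demo returns
instead of spinning the way the kernel loop would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three tcp_sock fields involved. */
struct model_sock {
	uint32_t snd_cwnd;
	uint32_t snd_cwnd_cnt;
	uint32_t snd_cwnd_clamp;
};

/* Iteration cap so the model terminates; the kernel loop has no cap. */
#define MODEL_LOOP_CAP 1000u

/*
 * Reduced model of the pre-patch loop plus the min() assignment.
 * Returns the number of loop iterations, or MODEL_LOOP_CAP when the
 * loop would not have terminated on its own.
 */
static uint32_t model_slow_start(struct model_sock *tp, uint32_t credit)
{
	uint32_t delta = 0;
	uint32_t next;

	assert(tp != NULL);

	tp->snd_cwnd_cnt += credit;
	while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
		if (delta == MODEL_LOOP_CAP)
			return MODEL_LOOP_CAP;	/* snd_cwnd == 0: no progress */
		tp->snd_cwnd_cnt -= tp->snd_cwnd;
		delta++;
	}

	/* Same effect as min(snd_cwnd + delta, snd_cwnd_clamp). */
	next = tp->snd_cwnd + delta;
	tp->snd_cwnd = next < tp->snd_cwnd_clamp ? next : tp->snd_cwnd_clamp;
	return delta;
}
```

Starting from snd_cwnd = 10 and snd_cwnd_clamp = 0, the first call
returns after one iteration but leaves snd_cwnd at 0. The second call
then subtracts zero forever, so the model stops only at the cap.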
With RCU, it seems the following code in tcpm_new() is racy:

	tm->tcpm_addr = *addr;
	...
	tcpm_suck_dst(tm, dst);

Coupled with the code in tcpm_suck_dst(tm, dst):

static void tcpm_suck_dst(struct tcp_metrics_block *tm, struct dst_entry *dst)
{
	u32 val;

	tm->tcpm_stamp = jiffies;

	val = 0;
	if (dst_metric_locked(dst, RTAX_RTT))
		val |= 1 << TCP_METRIC_RTT;
	if (dst_metric_locked(dst, RTAX_RTTVAR))
		val |= 1 << TCP_METRIC_RTTVAR;
	if (dst_metric_locked(dst, RTAX_SSTHRESH))
		val |= 1 << TCP_METRIC_SSTHRESH;
	if (dst_metric_locked(dst, RTAX_CWND))
		val |= 1 << TCP_METRIC_CWND;
	if (dst_metric_locked(dst, RTAX_REORDERING))
		val |= 1 << TCP_METRIC_REORDERING;
	tm->tcpm_lock = val;

	// HERE we set tcpm_lock before the tcpm_vals[]
	tm->tcpm_vals[TCP_METRIC_RTT] = dst_metric_raw(dst, RTAX_RTT);
	tm->tcpm_vals[TCP_METRIC_RTTVAR] = dst_metric_raw(dst, RTAX_RTTVAR);
	tm->tcpm_vals[TCP_METRIC_SSTHRESH] = dst_metric_raw(dst, RTAX_SSTHRESH);
	tm->tcpm_vals[TCP_METRIC_CWND] = dst_metric_raw(dst, RTAX_CWND);
	tm->tcpm_vals[TCP_METRIC_REORDERING] = dst_metric_raw(dst, RTAX_REORDERING);
	tm->tcpm_ts = 0;
	tm->tcpm_ts_stamp = 0;
	tm->tcpm_fastopen.mss = 0;
	tm->tcpm_fastopen.syn_loss = 0;
	tm->tcpm_fastopen.cookie.len = 0;
}
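
The concern is that a lockless reader could find the block while it is
only partly filled in, for instance with tcpm_lock already set but
tcpm_vals[] not yet written. A generic way to avoid that kind of window
is to fill in every field first and make the block visible last. The
following is a userspace C11 sketch of that ordering, not the kernel
fix: all model_* names are hypothetical, and the release/acquire pair
only stands in for what rcu_assign_pointer() and rcu_dereference()
provide in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of metrics for the sketch. */
enum { MODEL_METRIC_RTT, MODEL_METRIC_CWND, MODEL_METRIC_MAX };

struct model_metrics_block {
	uint32_t lock;				/* bitmask of locked metrics */
	uint32_t vals[MODEL_METRIC_MAX];
};

/* Slot that readers look at; NULL means nothing is published yet. */
static _Atomic(struct model_metrics_block *) model_slot;

/* Writer: initialize the whole block, then publish it as the last step. */
static void model_publish(struct model_metrics_block *tm,
			  const uint32_t *vals, uint32_t lock)
{
	size_t i;

	assert(tm != NULL && vals != NULL);

	for (i = 0; i < MODEL_METRIC_MAX; i++)
		tm->vals[i] = vals[i];
	tm->lock = lock;

	/* Release store: the writes above become visible before the pointer. */
	atomic_store_explicit(&model_slot, tm, memory_order_release);
}

/* Reader: returns 1 and fills the outputs if a block is published, else 0. */
static int model_read_cwnd(uint32_t *cwnd, int *locked)
{
	struct model_metrics_block *tm =
		atomic_load_explicit(&model_slot, memory_order_acquire);

	if (tm == NULL)
		return 0;
	*cwnd = tm->vals[MODEL_METRIC_CWND];
	*locked = (int)((tm->lock >> MODEL_METRIC_CWND) & 1u);
	return 1;
}
```

With this ordering a reader sees either no block at all or a fully
initialized one. Reusing a block that readers can already reach would
need more than this, since the fields would then change under them.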