netdev - Re: BUG: soft lockup - CPU#6 stuck for 22s! [httpd2-event:15597]

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1346040367.2420.22.camel@edumazet-glaptop>
Date:	Sun, 26 Aug 2012 21:06:07 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Neal Cardwell <ncardwell@...gle.com>
Cc:	Cristian Rodríguez <crrodriguez@...nsuse.org>,
	Netdev <netdev@...r.kernel.org>,
	Yuchung Cheng <ycheng@...gle.com>
Subject: Re: BUG: soft lockup - CPU#6 stuck for 22s! [httpd2-event:15597]

On Mon, 2012-08-27 at 00:02 -0400, Neal Cardwell wrote:
> On Sat, Aug 25, 2012 at 7:47 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > On Sat, 2012-08-25 at 11:14 +0200, Eric Dumazet wrote:
> >> From: Eric Dumazet <edumazet@...gle.com>
> >>
> >> On Sat, 2012-08-25 at 10:59 +0200, Eric Dumazet wrote:
> >> > On Fri, 2012-08-24 at 20:50 -0400, Cristian Rodríguez wrote:
> >> > > Hi, the issue I reported with IPV6 few weeks ago seems to be gone, but
> >> > > now I am getting the following crash..
> >>
> >> > Oh, I now see the bug, I'll send a patch asap
> >>
> >> Please try the following fix.
> >>
> >> Thanks !
> >
> > Well, this v2 seems cleaner :
> >
> > [PATCH v2] tcp: tcp_slow_start() should not decrease snd_cwnd
> >
> > Cristian Rodríguez reported various lockups in TCP stack,
> > introduced by commit 9dc274151a548 (tcp: fix ABC in tcp_slow_start())
> >
> > We could exit tcp_slow_start() with a zeroed snd_cwnd,
> > and next time we enter tcp_slow_start(), we run an infinite loop.
> >
> > Reported-by: Cristian Rodríguez <crrodriguez@...nsuse.org>
> > Cc: Yuchung Cheng <ycheng@...gle.com>
> > Cc: Neal Cardwell <ncardwell@...gle.com>
> > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > ---
> >  net/ipv4/tcp_cong.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
> > index 1432cdb..e656c72 100644
> > --- a/net/ipv4/tcp_cong.c
> > +++ b/net/ipv4/tcp_cong.c
> > @@ -337,7 +337,7 @@ void tcp_slow_start(struct tcp_sock *tp)
> >                 tp->snd_cwnd_cnt -= tp->snd_cwnd;
> >                 delta++;
> >         }
> > -       tp->snd_cwnd = min(tp->snd_cwnd + delta, tp->snd_cwnd_clamp);
> > +       tp->snd_cwnd = clamp(tp->snd_cwnd + delta, tp->snd_cwnd, tp->snd_cwnd_clamp);
> 
> AFAICT if tcp_slow_start() is changing snd_cwnd from non-zero to zero
> then this is because snd_cwnd_clamp is zero here, as you theorize may
> be happening to races somewhere.
> 
> However, AFAICT from reading the min() and clamp() macros, this code
> with clamp() will still have the same problem as the existing code
> that uses min: if snd_cwnd_clamp is 0 then snd_cwnd will end up 0
> here. (This is because the clamp() macro implicitly assumes that the
> max value is above the min value, and filters agains the max last.)
> 

Indeed, so the first patch was better...

Not sure I can investigate this problem this week, as I attend LKS/LPC
in San Diego.

Could be that snd_cwnd is zero as well so we have this infinite loop...

        while (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
                tp->snd_cwnd_cnt -= tp->snd_cwnd;
                delta++;
        }



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html