netdev - Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0805311044360.2189@wrl-59.cs.helsinki.fi>
Date:	Sat, 31 May 2008 13:05:26 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Evgeniy Polyakov <johnpol@....mipt.ru>
cc:	Ray Lee <ray-lk@...rabbit.org>, Ingo Molnar <mingo@...e.hu>,
	LKML <linux-kernel@...r.kernel.org>,
	Netdev <netdev@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

On Sat, 31 May 2008, Evgeniy Polyakov wrote:

> On Sat, May 31, 2008 at 12:11:32AM +0300, Ilpo Järvinen (ilpo.jarvinen@...sinki.fi) wrote:
> > > > > Active Internet connections (w/o servers)
> > > > > Proto Recv-Q Send-Q Local Address               Foreign Address
> > > > > State
> > > > > tcp        0 174592 10.0.1.14:58015             10.0.1.14:3632  ESTABLISHED
> > > > > tcp    72134      0 10.0.1.14:3632              10.0.1.14:58015 ESTABLISHED
> > 
> >              ^^^^^
> > 
> > Can you perhaps find/guess/think some explanation for this _receiver 
> > queue_...? This was a trick question :-), as we already know that the 
> > receiving process is no longer there and therefore obviously won't be 
> > reading anything anymore. But that opened another question, why TCP is 
> > then still in ESTABLISHED as orphaned TCP shouldn't be in establised state 
> > anymore, tcp_close should have changed the state (either at close or at 
> > process exit). I guess once it becomes known why tcp_close either wasn't 
> > called at all or it didn't change the state of the flow (it's quite 
> > simple, see for yourself), the cause of the bug is found (it might even be 
> > that the process went away when it shouldn't have, either a bookkeeping 
> > bug somewhere or real death, or something along those lines).
> 
> In that case pid Ingo straced was still valid and existed (ans stuck
> probably in tcp_sendpage() - sysrq-t dump was messy),

...No, I think it was the sender, not the receiver that Ingo straced!
A _correctly_ behaving TCP sender will just get "stuck" once the window
is exhausted. It as was proven that advertized window is zero (tcpdump).
So I find that hardly a bug.

> so process still exist, but sock somewhat lost its reference to socket.

...this is something that seems possible explanation.

> AFICS all (three)
> calls for sk_orphan() can not happen without state change to something
> closing (like fin wait or close), except some linger things and
> listening socket, which frees its connection requests.

Yes, this what I figured out as well (distcc didn't seem to play with 
linger, I checked, nor would it have any reason to)... The LISTENING 
socket was still there in /proc/net/tcp, so closing that won't explain it.

In addition, it seems that tcp_close would notify the other end as well, 
either with RST or with FIN, which would cause the other end to leave 
ESTABLISHED (unless tcp_transmit_skb would fail but I guess that would end 
up into some MIB in either of the cases). Here RST is the way to go 
because there's unread data (though in theory having yet another bug could 
allow building up the receiver queue afterwards).

Which then boils down to two alternatives:
  1) The receiving process lost its reference though the process 
     still exists and therefore cannot read
  2) tcp_close wasn't called when it should have, e.g., process exited
     but the tcp_close wasn't called because the reference wasn't there

Btw, Ingo also got this warning later on which might be helpful in 
understanding all this:

>>>>> after ~7 hours of uptime the networking code produced this assertion:
>>>>>
>>>>> [25441.140000] KERNEL: assertion (!sk->sk_ack_backlog) failed at 
>>>>>  net/ipv4/inet_connection_sock.c (642)

I guess that sk_ack_backlog was already visible at the time when the TCP 
got stuck because listening socket already had 1 there in /proc/net/tcp 
(though this depends if there was some other connection attempts still 
going on).

...Some new info (at least to me). I tried with ./stress_accept 5000 2 
(attached), I could easily produce a number of ESTABLISHED connections 
which won't yet have an owner in netstat (tested in 2.6.25.3) while 
connect.sh is running but it seems some kind of transitional state as once 
I do killall -TERM connect.sh (attached), they get the owning 
stress_accept. So third explanation would be some race during that 
transition that would allow process to never get the reference...

Before:
tcp    65436      0 lo:5000          lo:59980         ESTABLISHED -
After killall:
tcp    76444      0 lo:5000          lo:59980         ESTABLISHED 9013/stress_accept

-- 
 i.
View attachment "stress_accept.c" of type "TEXT/PLAIN" (1657 bytes)

Download attachment "connect.sh" of type "APPLICATION/x-sh" (85 bytes)