netdev - Re: bug in select(2) regarding non-blocking connect(2) completion?

Message-ID: <1304770358.2821.1139.camel@edumazet-laptop>
Date:	Sat, 07 May 2011 14:12:38 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Michael Shuldman <michaels@...t.no>
Cc:	linux-kernel@...r.kernel.org,
	"David S. Miller" <davem@...emloft.net>, karls@...t.no,
	netdev <netdev@...r.kernel.org>
Subject: Re: bug in select(2) regarding non-blocking connect(2) completion?

On Saturday, 07 May 2011 at 12:51 +0200, Michael Shuldman wrote:
> Hello, I am occasionally encountering what I believe is a bug in the
> kernel.
> 
> Below is a strace that I believe shows how the bug manifests itself,
> with my comments.
> 
> 
> # first select.  All fd's in the write set ([15 17 ... 51 55]) are 
> # non-blocking sockets that have had a connect(2) previously issued on
> # them, and which have yet to finish connecting as far as we know
> # at the time we call select(2).

We don't see the return value of connect(): maybe the error was already
returned there.

Only EINPROGRESS is valid here (on any other error, the fd should be
closed right away).
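
To make the point about connect()'s own return value concrete, here is a
minimal sketch of starting a non-blocking connect so that an immediate
error is not lost. The helper name start_connect() and its return
convention are hypothetical, not taken from the proxy in question.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: put fd in non-blocking mode and start connecting.
 * Returns  0 if the connection completed immediately,
 *          1 if it is in progress (wait for writability, then check it),
 *         -1 on any other error; fd is closed and errno is preserved.
 */
int start_connect(int fd, const struct sockaddr *addr, socklen_t len)
{
    int saved;
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(fd, addr, len) == 0)
        return 0;               /* connected right away */
    if (errno == EINPROGRESS)
        return 1;               /* the only "keep waiting" answer */

fail:
    /* The error was reported here; it must not be looked for later. */
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}
```

Only a return of 1 means the fd belongs in the select() write set; with
-1 the fd is already gone and errno holds the reason.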

> 03:55:31.808548 select(58, [4 8 11 12 13 14 16 18 19 20 21 22 23 24 26 27 30 31 32 33 34 35 36 37 39 40 41 43 44 46 48 49 50 52 53 54 57], [15 17 25 29 45 47 51 55], [11 12 13 14 16 18 19 20 21 22 23 24 26 27 30 31 32 33 34 35 36 37 39 40 41 43 44 46 48 49 50 52 53 54 57], {1, 0}) = 3 (in [16 26], out [51], left {1, 0})
> 
> # As indicated by the results returned by the above select(2), fd 51 should
> # have finished the connect attempt, but when we try to find out whether 
> # the connect(2) succeeded or failed, the results are conflicting.
> 

If the connect() attempt is rejected by the remote peer, select() still
reports your fd as 'writable', in the sense that you now have the
definitive answer to your non-blocking connect().
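
As a sketch of what "writable means the answer is available" implies for
the caller: wait for writability, then read SO_ERROR exactly once to
learn whether the answer is success or failure. The function name
await_connect() and its return convention are assumptions made for this
illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: wait up to timeout_sec for a pending connect.
 * Returns  1 when the outcome is known; *err is 0 on success or an
 *            errno value such as ECONNREFUSED on failure,
 *          0 on timeout (still connecting),
 *         -1 if select() or getsockopt() itself failed.
 */
int await_connect(int fd, int timeout_sec, int *err)
{
    fd_set wfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    socklen_t len = sizeof *err;
    int n;

    if (fd < 0 || fd >= FD_SETSIZE) {   /* fd_set cannot hold this fd */
        errno = EBADF;
        return -1;
    }
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);

    n = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (n <= 0)
        return n;                       /* timeout or select() error */

    /* Writable: success and failure both end up here. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, err, &len) < 0)
        return -1;
    return 1;
}
```

A return of 1 with *err != 0 is the "rejected by the remote peer" case:
the fd was reported writable even though the connection never came up.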

> 03:55:31.808622 getpeername(51, 0x7fff5d2eaa8c, [0]) = -1 ENOTCONN (Transport endpoint is not connected)

This means the endpoint is not connected: the other peer sent a RST or
did not answer the SYN packets.


> 03:55:31.808900 getsockopt(51, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> 

Hmm, interesting... Are you sure a previous call was not already made
(since reading SO_ERROR clears the error)?
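
Because reading SO_ERROR clears it, a second read on the same socket
returns 0 even though the connect failed. One way to guard against that
is to latch the first nonzero value per connection. The struct and
function below are a hypothetical sketch, not code from the application
being debugged.

```c
#include <sys/socket.h>

/* Hypothetical per-connection state kept by the application. */
struct conn {
    int fd;
    int saved_err;  /* first nonzero SO_ERROR seen, or 0 if none yet */
};

/*
 * Return the connection's error, querying the kernel only while no
 * error has been latched. Later calls keep returning the latched
 * value even though the kernel has reset SO_ERROR to 0.
 */
int conn_error(struct conn *c)
{
    int err = 0;
    socklen_t len = sizeof err;

    if (c->saved_err == 0 &&
        getsockopt(c->fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 &&
        err != 0)
        c->saved_err = err;

    return c->saved_err;
}
```

If every code path that inspects the socket goes through conn_error(),
an earlier read can no longer hide the failure from a later one.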

> # getpeername(2) failing on a socket that has finished connecting should 
> # indicate that the connect(2) failed.  Yet when we try to fetch the
> # SO_ERROR of the socket, it says no error is currently set.
> # We then loop around with select(2) again, and again the same thing
> # happens:
> 
> 03:55:31.809259 select(58, [4 8 11 12 13 14 16 18 19 20 21 22 23 24 26 27 30 31 32 33 34 35 36 37 39 40 41 43 44 46 48 49 50 52 53 54 57], [15 17 25 29 45 47 51 55], [11 12 13 14 16 18 19 20 21 22 23 24 26 27 30 31 32 33 34 35 36 37 39 40 41 43 44 46 48 49 50 52 53 54 57], {1, 0}) = 3 (in [16 26], out [51], left {1, 0})
> 03:55:31.809329 getpeername(51, 0x7fff5d2eaa8c, [0]) = -1 ENOTCONN (Transport endpoint is not connected)
> 03:55:31.809640 getsockopt(51, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
> 

Well, if you missed the original error report, all subsequent
getpeername() and SO_ERROR calls will do the same, and select() will
keep saying the fd is ready for 'write'.

> ...
> 
> # finally, getsockopt(2) returns that the connect(2) failed.
> 03:55:32.521146 getpeername(51, 0x7fff5d2eaa8c, [0]) = -1 ENOTCONN (Transport endpoint is not connected)
> 03:55:32.521614 getsockopt(51, SOL_SOCKET, SO_ERROR, [101], [4]) = 0
> 
> In other words, select(2) says the socket has finished connecting,
> getpeername(2) neither confirms nor denies this (it can only confirm
> if the connect finished successfully).  getsockopt(2) and SO_ERROR,
> however, say there is no error on the socket, which, coupled
> with getpeername(2) failing, indicates that the connect(2) has
> not yet finished.
> 
> 
> 
> This does not happen all the time.  E.g., I watched the system for
> an hour yesterday, as things were starting up and the number of
> concurrent tcp clients gradually increased from zero to around 700,
> with no observable problems.  However, after a while, the problem
> starts occurring; whether it is related to an increasing number of
> clients or to something else, I do not know.
> 
> Currently the system has a little over 3,000 clients and the problem
> occurs now and then, sometimes several times a minute, while sometimes
> it can take dozens of minutes between each time.  At the moment,
> the last time the problem was detected was 40 minutes ago.
> 
> The software the above strace is related to is a proxy server, and
> if there are 3000 clients (incoming TCP sessions), there would
> normally be 3000 outgoing TCP sessions also.  
> 
> uname -a on the system in question reports 
> 2.6.18-238.9.1.el5 #1 SMP Tue Apr 12 18:10:13 EDT 2011 x86_64 x86_64
> x86_64 GNU/Linux
> 
> Thankful for any hints or pointers related to this problem.
> With kind regards,
> 

Make sure you don't miss an error from the connect() system call.


