netdev - listen(2) backlog changes in or around Linux 3.1?


Message-ID: <CAJgzZorigejCuFweNrvmkEJts3Um7exh1fYTH=4KrEcB=v=2SA@mail.gmail.com>
Date:	Fri, 12 Oct 2012 16:40:40 -0700
From:	enh <enh@...gle.com>
To:	netdev@...r.kernel.org
Subject: listen(2) backlog changes in or around Linux 3.1?

i used to use the following hack to unit test connect timeouts: i'd
call listen(2) on a socket and then deliberately connect (backlog + 3)
sockets without accept(2)ing any of the connections. (why 3? because
Stevens told me so, and experiment backed him up. see figure 4.10 in
his UNIX Network Programming.)
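
a minimal sketch of the trick described above, in Python rather than whatever the original tests used. the helper name `fill_backlog` and its `extra` and `wait` parameters are made up for illustration. how many of the connects complete depends on the kernel, which is exactly the problem, so the sketch only reports the count.

```python
import errno
import os
import select
import socket
import time

def fill_backlog(backlog, extra=3, wait=0.5):
    """Listen on a loopback port, never accept, start backlog + extra connects.

    Returns (listener, clients, completed), where completed counts the
    clients whose handshake finished within roughly `wait` seconds.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(backlog)
    addr = listener.getsockname()

    clients = []
    for _ in range(backlog + extra):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.setblocking(False)
        err = c.connect_ex(addr)
        # a non-blocking connect normally reports EINPROGRESS;
        # 0 means it completed immediately.
        if err not in (0, errno.EINPROGRESS):
            c.close()
            raise OSError(err, os.strerror(err))
        clients.append(c)

    # give the handshakes time to finish, then take a snapshot:
    # a socket is writable once its connect has finished (or failed).
    time.sleep(wait)
    _, writable, _ = select.select([], clients, [], 0)
    completed = sum(
        1 for c in writable
        if c.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0
    )
    return listener, clients, completed
```

a test would call `fill_backlog(backlog)` and then expect one more connect(2) to the same port to hang until something calls accept(2) on the listener.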

with "old" kernels, 2.6.35-ish to 3.0-ish, this worked great. my next
connect(2) to the same loopback port would hang indefinitely. i could
even unblock the connect by calling accept(2) in another thread. this
was awesome for testing.

in 3.1 on ARM, 3.2 on x86 (Ubuntu desktop), and 3.4 on ARM, this no
longer works. it doesn't seem to be as simple as "the constant is no
longer 3". my tests are now flaky. sometimes they work like they used
to, and sometimes an extra connect(2) will succeed. (or, in
non-blocking mode, my poll(2) returns and reports the socket that's
still trying to connect as ready.)

since this changed in 3.1 and is still that way in 3.4, i'm guessing
the change wasn't an accident. but i haven't been able to find
the right search terms to RTFM. i also finally got around to grepping
the kernel for the "+ 3", but wasn't able to find it. (so i'd be
interested to know where the old behavior came from too.)

my least worst workaround at the moment is to use one of RFC 5737's
test networks, but that requires that the device have a network
connection, otherwise my connect(2)s fail immediately with
ENETUNREACH, which is no use to me. also, unlike my old trick, i've
got no way to suddenly "unblock" a slow connect(2) (this is useful for
unit testing the code that does the poll(2) part of the usual
connect-with-timeout implementation).
https://android-review.googlesource.com/#/c/44563/
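
for reference, a sketch of the "usual connect-with-timeout implementation" mentioned above: non-blocking connect(2), then poll(2) for writability, then SO_ERROR to see how the connect ended. this is not the code under test in the original message; the function name `connect_with_timeout` is hypothetical, and it assumes a platform where Python's `select.poll` is available (Linux is).

```python
import errno
import os
import select
import socket

def connect_with_timeout(addr, timeout_s):
    """Return a connected TCP socket, or raise TimeoutError / OSError."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setblocking(False)
    try:
        err = s.connect_ex(addr)
        if err not in (0, errno.EINPROGRESS):
            raise OSError(err, os.strerror(err))
        if err == errno.EINPROGRESS:
            # this is the poll(2) step that the backlog trick exercised:
            # the socket turns writable when the connect finishes or fails.
            poller = select.poll()
            poller.register(s, select.POLLOUT)
            if not poller.poll(int(timeout_s * 1000)):  # milliseconds
                raise TimeoutError("connect timed out")
            err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
            if err != 0:
                raise OSError(err, os.strerror(err))
    except BaseException:
        s.close()
        raise
    s.setblocking(True)
    return s
```

against a listener whose queue is full, the poll step is what should time out; calling accept(2) on the listener from another thread is what used to "unblock" it.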

hopefully someone here can shed some light on this? ideally someone
will have a workaround as good as my old trick. i realize i was
relying on undocumented behavior, and i'm happy to have to check
/proc/version and behave appropriately, but i'd really like a way to
keep my unit tests!
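
a sketch of what "check the kernel version and behave appropriately" could look like. the helper names are hypothetical, and the (3, 1) cutoff is only the observation from this message (worked up to 3.0-ish, flaky from 3.1 on), not a documented boundary.

```python
import re

def parse_kernel_release(release):
    """Turn a release string such as '3.2.0-31-generic' into (3, 2)."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return int(m.group(1)), int(m.group(2))

def backlog_trick_expected_to_work(release):
    # assumption taken from the message above: the backlog trick was
    # reliable before 3.1 and unreliable from 3.1 onwards.
    return parse_kernel_release(release) < (3, 1)
```

on a live system the release string can come from `platform.release()`, which carries the same version that /proc/version reports.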

thanks,
 elliott