netdev - why doesn't close() wake a blocked read() --- and what should i do about it?


Message-ID: <AANLkTin4ugMqekRX389XxPx_R8Nu2cWYCLYtpU5rSrsU@mail.gmail.com>
Date:	Fri, 9 Jul 2010 16:38:43 -0700
From:	enh <enh@...gle.com>
To:	netdev@...r.kernel.org
Subject: why doesn't close() wake a blocked read() --- and what should i do 
	about it?

on Android (Linux 2.6.32), if one thread is in accept(2) and another
thread calls close(2) on that socket, the first thread returns with an
error. likewise if the first thread is in recv(2) waiting for a
datagram packet. but if the first thread is just doing a regular
read(2), it's not woken. (similarly for write(2).) this is unfortunate
for me, because that's not how Java is supposed to work
(http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4344135); calling
Socket.close from another thread is the way you're supposed to unblock
such things. apps need this so they can free up threads. (and i'm
trying to fix the VM so apps can do this.)

it's well documented on the web that this is how linux behaves, though
i didn't find any explanation of why, or why there's the discrepancy
between read(2) and other similar operations. or is this just a bug?

on the assumption that this is something i need to work around in
userspace, what's my best choice? i can pthread_kill the stuck
threads, but that means they need to be somewhat aware of this hack.
if they just retry on EINTR (TEMP_FAILURE_RETRY), there's a race: the
fd may have been reopened as something else between the close(2) and
their retry. it also means i'm out of luck for code i don't control,
such as openssl: that code is always going to be open to the race.

alternatively, i could have a pipe per thread, select(2) on both the
pipe and the socket rather than read(2) directly, and then examine my
fd_sets to see whether it's time to read or time to give up, but that
seems unnecessarily resource-intensive, and also doesn't address the
openssl case.
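
The pipe-plus-select(2) variant could look roughly like the sketch below. The helper name cancellable_read and the use of ECANCELED to report "woken up" are assumptions made for illustration. wake_fd is assumed to be the read end of the per-thread pipe; the cancelling thread writes one byte to the other end.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until either sock_fd has data or wake_fd
 * (read end of a per-thread pipe) becomes readable.
 * Returns the result of read(2) on sock_fd, or -1 with
 * errno = ECANCELED if the wake pipe fired first. */
ssize_t cancellable_read(int sock_fd, int wake_fd, void *buf, size_t len)
{
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sock_fd, &rfds);
        FD_SET(wake_fd, &rfds);
        int maxfd = sock_fd > wake_fd ? sock_fd : wake_fd;

        int n = select(maxfd + 1, &rfds, NULL, NULL, NULL);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* unrelated signal: wait again */
            return -1;
        }
        if (FD_ISSET(wake_fd, &rfds)) {
            errno = ECANCELED;  /* time to give up */
            return -1;
        }
        if (FD_ISSET(sock_fd, &rfds))
            return read(sock_fd, buf, len);
    }
}
```

The wake pipe is checked before the socket, so a cancellation wins even when data is also pending. The sketch costs two extra fds per thread, which is the resource overhead mentioned above, and select(2) only handles descriptors below FD_SETSIZE.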

alternatively, there's shutdown(2), but i'm not sure that really does
what i want either. for one thing, it's not clear how shutdown(2)
interacts with SO_LINGER (i'm thinking of write(2) here). also, my
unblocked read(2) returns 0 rather than -1 with some recognizable
value of errno. that isn't unreasonable in general, just for this
particular use, so i'd need to do some bookkeeping to check what that
0 really means.
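
One possible shape for that bookkeeping, as a sketch only: the struct tracked_sock and the tracked_cancel / tracked_read helpers are hypothetical names, and reporting the local cancel as ECANCELED is an assumption. A flag is set before shutdown(2) is called, so a read(2) that returns 0 can be told apart from an ordinary end-of-stream from the peer.

```c
#include <errno.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper pairing a socket with a "closing" flag. */
struct tracked_sock {
    int fd;
    atomic_int closing;
};

/* Called from another thread to unblock readers. The flag is set
 * before shutdown(2) so a woken reader is guaranteed to see it. */
void tracked_cancel(struct tracked_sock *s)
{
    atomic_store(&s->closing, 1);
    shutdown(s->fd, SHUT_RDWR);
}

/* Like read(2), except that a 0 caused by our own shutdown(2) is
 * reported as -1 with errno = ECANCELED. A 0 with the flag clear
 * still means the peer closed its end. */
ssize_t tracked_read(struct tracked_sock *s, void *buf, size_t len)
{
    ssize_t n = read(s->fd, buf, len);
    if (n == 0 && atomic_load(&s->closing)) {
        errno = ECANCELED;
        return -1;
    }
    return n;
}
```

The fd itself would be closed only after every reader has returned, so the number cannot be reused while a thread is still blocked on it. The sketch says nothing about the SO_LINGER question for write(2).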

any comments on these ideas, or other ideas i haven't thought of?

thanks,
 --elliott