Message-ID: <MDEHLPKNGKAHNMBLJOLKOEDNAFAD.davids@webmaster.com>
Date: Fri, 10 Oct 2008 21:48:45 -0700
From: "David Schwartz" <davids@...master.com>
To: <linux-kernel@...r.kernel.org>
Subject: RE: recv() hangs until SIGCHLD ?
Nicolas Cannasse wrote:
> In some rare cases, one (or several) threads are hanging in recv().
> Both lsof and ls /proc/<pid>/fd show that the socket used is in
> ESTABLISHED mode but when checking on the host on which it's connected
> (a mysql DB) we can't find the corresponding client socket (as it's
> been closed already on the other side).
A blocking recv() blocks until data arrives, the connection is closed or
reset, or a signal interrupts the call. If the other end never sends
anything, it can block forever.
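If waiting forever is not acceptable, the usual approach is to bound the wait. The following is only a minimal sketch, assuming a stream socket and Python's standard socket module; the helper name recv_with_timeout is hypothetical and not something from the original report.

```python
import socket

def recv_with_timeout(sock, nbytes, timeout_s):
    """Hypothetical helper: recv() that gives up after timeout_s seconds.

    Returns the received bytes, b"" if the peer closed the connection,
    or None if nothing arrived before the timeout expired.
    """
    old_timeout = sock.gettimeout()   # None means fully blocking
    sock.settimeout(timeout_s)
    try:
        return sock.recv(nbytes)
    except socket.timeout:
        # No data, no close, no error within the time limit.
        return None
    finally:
        # Restore the caller's original blocking mode.
        sock.settimeout(old_timeout)
```

On a connected pair with nothing queued, this returns None after the timeout instead of hanging; once the peer sends or closes, it returns the data or b"" respectively.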
> We are using the Boehm GC which uses the signals SIGXCPU and SIGPWR to
> pause+restart the threads when running a GC cycle. We are correctly
> handling EINTR in send() and recv() by restarting the call in case
> they get interrupted this way.
>
> However, when attaching GDB to our locked thread it seems that even
> when the GC runs, recv() does not exit (the breakpoint after it is not
> reached). If we send SIGCHLD to the hanging thread with GDB, recv()
> does exit and the thread is correctly unlocked. If we don't, it will
> hang forever.
Why shouldn't it hang forever? What was supposed to wake it that's not?
> Any idea how we can stop this from happening or what additional things
> we can check to get more information on what's occurring?
You say a thread is hanging in receive and not returning. But you've yet to
explain why it should return. Was it interrupted by a signal? Was data
received? Is the socket non-blocking? Why isn't this expected behavior?
Blocking sockets block, full stop.
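One way to answer those questions for a given socket is to peek at it without blocking. This is a rough sketch under stated assumptions: a connected stream socket, Python's standard socket module, and a hypothetical helper name recv_state. It does not cover signal interruption, and a reset connection would surface as an exception rather than one of the three return values.

```python
import socket

def recv_state(sock):
    """Hypothetical helper: report what a recv() on sock would do right now.

    Returns "data" if bytes are queued, "eof" if the peer has closed,
    or "would_block" if a blocking recv() would have to wait.
    """
    old_timeout = sock.gettimeout()
    sock.setblocking(False)           # make this one call non-blocking
    try:
        # MSG_PEEK looks at queued data without consuming it.
        data = sock.recv(1, socket.MSG_PEEK)
    except BlockingIOError:
        return "would_block"
    finally:
        sock.settimeout(old_timeout)  # restore the original mode
    return "data" if data else "eof"
```

If this reports "would_block" on the hung socket, the local kernel has seen neither data nor a close from the peer, and a blocking recv() has nothing to return.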
DS