Message-ID: <20081020101549.GH2811@fc6222126.aspadmin.net>
Date:	Mon, 20 Oct 2008 05:15:49 -0500
From:	swivel@...lls.gnugeneration.com
To:	Nicolas Cannasse <ncannasse@...ion-twin.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: poll() blocked / packets not received ?

On Mon, Oct 20, 2008 at 10:25:10AM +0200, Nicolas Cannasse wrote:
> Hello,
> 
> We have an application that uses pthreads and (blocking) sockets.
> 
> When the application runs with one single thread in separate processes 
> (using fork()) we don't get any problem.
> 
> However when it's multithreaded, we sometimes get stuck while poll()ing 
> a socket (with events set to POLLIN). Even after the other side of the 
> connection has closed its side of the connection, we are still stuck 
> here. Adding a timeout only makes the poll() exit with 0, so we loop.
> 
> In case we don't loop the next operation is a recv() which will block as 
> well (which is consistent).
> 
> It seems like nothing is received on the socket any longer, but it's 
> difficult to verify with tcpdump since our server outputs something like 
> 15MB at peak time with 150 hits per second.
> 
> We have Shorewall installed and enabled, but what seems strange is that 
> the problem depends on multithreading. It also occurs much more often on 
> the 4-core machines than on the 2-core ones (both with Hyperthreading 
> activated). We're using kernel 2.6.20-15-server (#2 SMP) provided by Ubuntu.
> 
> Any tip on how we could fix that or investigate further would be 
> appreciated. After one month of debugging we're really out of solutions now.
> 
> Best,
> Nicolas

Your usage pattern is a very common one. I highly doubt you are experiencing
a kernel bug here, or many people (myself included) would be complaining.

Shorewall sounds like it might be suspect. Are FINs not coming in when the
remote closes?  You can look in the output of netstat to see what state the
TCP connection is in. Is it still ESTABLISHED?

Have you tried just disabling the firewall to see if the problem
disappears?
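
To tell from inside the application whether a FIN ever arrived, a check
along these lines may help. It is only a sketch: wait_readable() and the
wait_result codes are names made up for this example, not anything from
your code, and it assumes Linux (MSG_DONTWAIT). It polls for POLLIN with a
timeout, then peeks one byte so that "data available" and "peer closed"
can be told apart without consuming anything.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch. */
enum wait_result {
    WAIT_DATA,      /* payload is queued; recv() would return > 0 */
    WAIT_TIMEOUT,   /* poll() returned 0: nothing arrived in time */
    WAIT_PEER_EOF,  /* peer's FIN was received; recv() would return 0 */
    WAIT_ERROR      /* poll()/recv() failed or a socket error is pending */
};

/* Wait up to timeout_ms for fd to become readable and classify the result. */
enum wait_result wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    char probe;
    ssize_t n;
    int rc;

    assert(fd >= 0);

    /* Retry if a signal interrupts poll(); note this restarts the timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    if (pfd.revents & (POLLERR | POLLNVAL))
        return WAIT_ERROR;

    /* POLLIN and/or POLLHUP: peek without blocking and without consuming. */
    do {
        n = recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);
    } while (n < 0 && errno == EINTR);

    if (n > 0)
        return WAIT_DATA;
    if (n == 0)
        return WAIT_PEER_EOF;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return WAIT_TIMEOUT;    /* spurious wakeup: nothing to read after all */
    return WAIT_ERROR;
}
```

If this keeps returning WAIT_TIMEOUT after the remote side has closed, the
FIN never reached the socket, which would point at something in the path
(such as the firewall) rather than at the threading in the application.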

Regards,
Vito Caputo