Date:	Tue, 25 Nov 2008 08:28:16 -0500
From:	Trond Myklebust <trond.myklebust@....uio.no>
To:	Ian Campbell <ijc@...lion.org.uk>
Cc:	linux-nfs@...r.kernel.org, Max Kellermann <mk@...all.com>,
	linux-kernel@...r.kernel.org, gcosta@...hat.com,
	Grant Coady <grant_lkml@...o.com.au>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Tom Tucker <tom@...ngridcomputing.com>
Subject: Re: [PATCH] NFS regression in 2.6.26?, "task blocked for more than
 120 seconds"

On Tue, 2008-11-25 at 07:09 +0000, Ian Campbell wrote:
> On Sat, 2008-11-01 at 09:41 -0400, Trond Myklebust wrote:
> > On Sat, 2008-11-01 at 11:45 +0000, Ian Campbell wrote:
> > > On Mon, 2008-10-20 at 07:27 +0100, Ian Campbell wrote:
> > > > So far I have bisected down to this range and am currently testing
> > > > acee478 which has been up for >4days.
> > > 
> > > Another update. It has now bisected down to a small range:
> > > 
> > > 7272dcd31d56580dee7693c21e369fd167e137fe SUNRPC: xprt_autoclose() should not call xprt_disconnect()
> > > e06799f958bf7f9f8fae15f0c6f519953fb0257c SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
> > > ef80367071dce7d2533e79ae8f3c84ec42708dc8 SUNRPC: TCP clear XPRT_CLOSE_WAIT when the socket is closed for writes
> > > 3b948ae5be5e22532584113e2e02029519bbad8f SUNRPC: Allow the client to detect if the TCP connection is closed
> > > 67a391d72ca7efb387c30ec761a487e50a3ff085 SUNRPC: Fix TCP rebinding logic
> > > 66af1e558538137080615e7ad6d1f2f80862de01 SUNRPC: Fix a race in xs_tcp_state_change()
> > > 
> > > I'm currently testing 3b948ae5be5e22532584113e2e02029519bbad8f.
> > > 
> > > 7272dcd31d56580dee7693c21e369fd167e137fe repro'd in half a day while
> > > ef818a28fac9bd214e676986d8301db0582b92a9 (parent of
> > > 66af1e558538137080615e7ad6d1f2f80862de01) survived for 7 days.
> 
> According to bisect:
> 
> e06799f958bf7f9f8fae15f0c6f519953fb0257c is first bad commit
> commit e06799f958bf7f9f8fae15f0c6f519953fb0257c
> Author: Trond Myklebust <Trond.Myklebust@...app.com>
> Date:   Mon Nov 5 15:44:12 2007 -0500
> 
>     SUNRPC: Use shutdown() instead of close() when disconnecting a TCP socket
>     
>     By using shutdown() rather than close() we allow the RPC client to wait
>     for the TCP close handshake to complete before we start trying to reconnect
>     using the same port.
>     We use shutdown(SHUT_WR) only instead of shutting down both directions,
>     however we wait until the server has closed the connection on its side.
>     
>     Signed-off-by: Trond Myklebust <Trond.Myklebust@...app.com>
> 
> I've started testing 2.6.26 + revert. It's been a long while since I
> started this process so I'll also have a go at an up to date version.
> 
> Cheers,

That would indicate that the server is failing to close the TCP
connection when the client closes on its end.
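
For illustration, the half-close sequence described in the quoted commit
message can be sketched in user space. This is a minimal Python sketch
over loopback, not the SUNRPC code: the function name half_close_demo is
hypothetical, and the "server" here is a well-behaved peer that closes as
soon as it sees the client's FIN. A server that never closes would leave
the client's final recv() waiting, which is the situation suspected above.

```python
import socket
import threading


def half_close_demo():
    """Return (server_saw_eof, client_saw_eof) for one loopback connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    result = {}

    def server():
        conn, _ = listener.accept()
        with conn:
            # recv() returning b"" means the client's FIN has arrived.
            result["server_saw_eof"] = conn.recv(1024) == b""
        # Leaving the with-block closes conn, which sends the server's FIN.

    thread = threading.Thread(target=server)
    thread.start()

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)  # do not wait forever if the peer never closes
    # Close only the write side; the read side stays open.
    client.shutdown(socket.SHUT_WR)
    # Wait until the server has closed its side of the connection.
    client_saw_eof = client.recv(1024) == b""
    client.close()

    thread.join()
    listener.close()
    return result["server_saw_eof"], client_saw_eof
```

If the server thread were changed to hold the connection open after
reading EOF, the client's recv() would raise socket.timeout after five
seconds instead of returning b"".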

Could you remind me what server you are using? Also, does 'netstat -t'
show connections that are stuck in the CLOSE_WAIT state when you see the
hang?
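
As a scripted alternative to reading 'netstat -t' output by eye, the
same information is available from /proc/net/tcp on Linux. The sketch
below is an assumption-laden helper, not part of any existing tool: the
function names are hypothetical, it handles IPv4 only, and the address
decoding assumes a little-endian host. State value 08 corresponds to
CLOSE_WAIT, the state of a socket that has received the peer's FIN but
has not yet been closed locally.

```python
import socket
import struct

CLOSE_WAIT = 0x08  # TCP_CLOSE_WAIT in the kernel's TCP state enum


def _decode(addr):
    """Turn '0100007F:0801' into ('127.0.0.1', 2049) (little-endian host)."""
    ip_hex, port_hex = addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def close_wait_sockets(proc_text):
    """Return (local, remote) address pairs for sockets in CLOSE_WAIT.

    proc_text is the full contents of /proc/net/tcp as a string.
    """
    found = []
    for line in proc_text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: sl, local_address, rem_address, st, ...
        if int(fields[3], 16) == CLOSE_WAIT:
            found.append((_decode(fields[1]), _decode(fields[2])))
    return found
```

To use it on a live system, pass in the text read from /proc/net/tcp
while the hang is in progress, and look for entries involving the NFS
port (2049 by default).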

Trond

