Message-ID: <1300750536.26546.31.camel@lade.trondhjem.org>
Date:	Mon, 21 Mar 2011 19:35:36 -0400
From:	Trond Myklebust <Trond.Myklebust@...app.com>
To:	"J. Bruce Fields" <bfields@...ldses.org>
Cc:	Wolfgang Walter <wolfgang.walter@...m.de>,
	linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: Re: problem with nfs4: rpciod seems to loop in rpc_shutdown_client
 forever

On Mon, 2011-03-21 at 19:28 -0400, J. Bruce Fields wrote:
> On Fri, Mar 18, 2011 at 11:49:21PM +0100, Wolfgang Walter wrote:
> > Hello,
> > 
> > I have a problem with our nfs-server (stable 2.6.32.33 but also with
> > .31 or .32 and probably older ones): sometimes
> > one or more rpciod get stuck. I used
> > 
> > 	rpcdebug -m rpc -s all
> > 
> > I get messages as the following one about every second:
> > 
> > Mar 18 11:15:37 au kernel: [44640.906793] RPC:       killing all tasks for client ffff88041c51de00
> > Mar 18 11:15:38 au kernel: [44641.906793] RPC:       killing all tasks for client ffff88041c51de00
> > Mar 18 11:15:39 au kernel: [44642.906795] RPC:       killing all tasks for client ffff88041c51de00
> > Mar 18 11:15:40 au kernel: [44643.906793] RPC:       killing all tasks for client ffff88041c51de00
> > Mar 18 11:15:41 au kernel: [44644.906795] RPC:       killing all tasks for client ffff88041c51de00
> > Mar 18 11:15:42 au kernel: [44645.906794] RPC:       killing all tasks for client ffff88041c51de00
> > 
> > and I get these messages:
> > 
> > Mar 18 22:45:57 au kernel: [86061.779008]   174 0381     -5 ffff88041c51de00   (null)        0 ffffffff817211a0 nfs4_cbv1 CB_NULL a:rpc_exit_task q:none
> > 
> > My theory is this:
> > 
> > * this async task is runnable but does not progress (calling rpc_exit_task).
> > * this is because the same rpciod which handles this task loops in
> >   rpc_shutdown_client waiting for this task to go away.
> > * because rpc_shutdown_client is called from an async rpc, too
> 
> Off hand I don't see any place where rpc_shutdown_client() is called
> from rpciod; do you?

The only case I could think of would be if we're still calling mntput()
from some RPC callback. In principle we should only be doing that from
the rpc_call_ops->rpc_callback() from within the nfsiod thread rather
than rpciod.

Is it possible this might be another instance of the nfs_commit_inode()
busy-loop?
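
To make the theory quoted above concrete, here is a toy model of the
suspected hang. It is only a sketch, not the SUNRPC code: Worker,
shutdown_client() and demo() are hypothetical names, a Python set stands in
for the client's task list, and the poll cap exists only so the demo
terminates. It assumes a single worker that must run a task's exit step
before that task leaves the client's list.

```python
from collections import deque


class Worker:
    """Toy stand-in for one single-threaded rpciod worker (hypothetical)."""

    def __init__(self):
        self.queue = deque()

    def run_one(self):
        # Run the next queued task step, if there is one.
        if self.queue:
            self.queue.popleft()()


def shutdown_client(pending, let_worker_run, max_polls=5):
    """Poll until the client's task set is empty.

    The real loop has no poll cap; max_polls is an assumption added so the
    demo finishes. Returns (polls, drained).
    """
    polls = 0
    while pending:
        if polls == max_polls:
            return polls, False
        polls += 1
        # Between polls the worker may or may not get a chance to run.
        let_worker_run()
    return polls, True


def demo(called_from_worker):
    worker = Worker()
    pending = {"CB_NULL"}
    # The task is runnable: its exit step is queued on the worker and would
    # remove it from the client's task set.
    worker.queue.append(lambda: pending.discard("CB_NULL"))
    if called_from_worker:
        # The worker itself is stuck inside shutdown_client(), so nothing
        # else on its queue can run.
        step = lambda: None
    else:
        # Shutdown runs elsewhere; the worker is free to make progress.
        step = worker.run_one
    return shutdown_client(pending, step)


if __name__ == "__main__":
    print(demo(called_from_worker=False))  # (1, True)
    print(demo(called_from_worker=True))   # (5, False)
```

In the second case the task stays runnable but never runs, which would match
an a:rpc_exit_task entry that never goes away while the "killing all tasks"
message repeats.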

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@...app.com
www.netapp.com
