lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20070316033912.8780a9cd.akpm@linux-foundation.org>
Date:	Fri, 16 Mar 2007 03:39:12 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Folkert van Heusden <folkert@...heusden.com>,
	linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...sign.ru>,
	Neil Brown <neilb@...e.de>
Subject: Re: [2.6.20] BUG: workqueue leaked lock

On Fri, 16 Mar 2007 09:41:20 +0100 Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:

> On Thu, 2007-03-15 at 11:06 -0800, Andrew Morton wrote:
> > > On Tue, 13 Mar 2007 17:50:14 +0100 Folkert van Heusden <folkert@...heusden.com> wrote:
> > > ...
> > > [ 1756.728209] BUG: workqueue leaked lock or atomic: nfsd4/0x00000000/3577
> > > [ 1756.728271]     last function: laundromat_main+0x0/0x69 [nfsd]
> > > [ 1756.728392] 2 locks held by nfsd4/3577:
> > > [ 1756.728435]  #0:  (client_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > [ 1756.728679]  #1:  (&inode->i_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > [ 1756.728923]  [<c1003d57>] show_trace_log_lvl+0x1a/0x30
> > > [ 1756.729015]  [<c1003d7f>] show_trace+0x12/0x14
> > > [ 1756.729103]  [<c1003e79>] dump_stack+0x16/0x18
> > > [ 1756.729187]  [<c102c2e8>] run_workqueue+0x167/0x170
> > > [ 1756.729276]  [<c102c437>] worker_thread+0x146/0x165
> > > [ 1756.729368]  [<c102f797>] kthread+0x97/0xc4
> > > [ 1756.729456]  [<c1003bdb>] kernel_thread_helper+0x7/0x10
> > > [ 1756.729547]  =======================
> > > [ 1792.436492] svc: unknown version (0 for prog 100003, nfsd)
> > > [ 1846.683648] BUG: workqueue leaked lock or atomic: nfsd4/0x00000000/3577
> > > [ 1846.683701]     last function: laundromat_main+0x0/0x69 [nfsd]
> > > [ 1846.683832] 2 locks held by nfsd4/3577:
> > > [ 1846.683885]  #0:  (client_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > [ 1846.683980]  #1:  (&inode->i_mutex){--..}, at: [<c1205b88>] mutex_lock+0x8/0xa
> > > [ 1846.683988]  [<c1003d57>] show_trace_log_lvl+0x1a/0x30
> > > [ 1846.683994]  [<c1003d7f>] show_trace+0x12/0x14
> > > [ 1846.683997]  [<c1003e79>] dump_stack+0x16/0x18
> > > [ 1846.684001]  [<c102c2e8>] run_workqueue+0x167/0x170
> > > [ 1846.684006]  [<c102c437>] worker_thread+0x146/0x165
> > > [ 1846.684012]  [<c102f797>] kthread+0x97/0xc4
> > > [ 1846.684023]  [<c1003bdb>] kernel_thread_helper+0x7/0x10
> > 
> > Oleg, that's a fairly incomprehensible message we have in there.  Can you
> > please explain what it means?
> 
> I think I'm responsible for this message (commit
> d5abe669172f20a4129a711de0f250a4e07db298); what is says is that the
> function executed by the workqueue (here laundromat_main) leaked an
> atomic context or is still holding locks (2 locks in this case).
> 

OK.  That's not necessarily a bug: one could envisage a (weird) piece of
code which takes a lock then releases it on a later workqueue invokation. 
But I'm not sure that nfs4_laundromat() is actually supposed to be doing
anything like that.

Then again, maybe it is: it seems to be waddling through a directory under
the control of a little state machine, with timeouts.

Neil: help?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ