Date:	Thu, 16 Apr 2009 01:07:36 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Alexander Beregalov <a.beregalov@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	linux-nfs@...r.kernel.org, netdev@...r.kernel.org
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Alessio Igor Bogani <abogani@...ware.it>,
	Jeff Mahoney <jeffm@...e.com>,
	ReiserFS Development List <reiserfs-devel@...r.kernel.org>,
	Chris Mason <chris.mason@...cle.com>
Subject: Re: [tree] latest kill-the-BKL tree, v12


* Alexander Beregalov <a.beregalov@...il.com> wrote:

> 2009/4/14 Ingo Molnar <mingo@...e.hu>:
> >
> > * Alexander Beregalov <a.beregalov@...il.com> wrote:
> >
> >> On Tue, Apr 14, 2009 at 05:34:22AM +0200, Frederic Weisbecker wrote:
> >> > Ingo,
> >> >
> >> > This small patchset fixes some deadlocks I've faced after trying
> >> > some pressures with dbench on a reiserfs partition.
> >> >
> >> > There is still some work pending such as adding some checks to ensure we
> >> > _always_ release the lock before sleeping, as you suggested.
> >> > Also I have to fix a lockdep warning reported by Alessio Igor Bogani.
> >> > And also some optimizations....
> >> >
> >> > Thanks,
> >> > Frederic.
> >> >
> >> > Frederic Weisbecker (3):
> >> >   kill-the-BKL/reiserfs: provide a tool to lock only once the write lock
> >> >   kill-the-BKL/reiserfs: lock only once in reiserfs_truncate_file
> >> >   kill-the-BKL/reiserfs: only acquire the write lock once in
> >> >     reiserfs_dirty_inode
> >> >
> >> >  fs/reiserfs/inode.c         |   10 +++++++---
> >> >  fs/reiserfs/lock.c          |   26 ++++++++++++++++++++++++++
> >> >  fs/reiserfs/super.c         |   15 +++++++++------
> >> >  include/linux/reiserfs_fs.h |    2 ++
> >> >  4 files changed, 44 insertions(+), 9 deletions(-)
> >> >
> >>
> >> Hi
> >>
> >> The same test - dbench on reiserfs on loop on sparc64.
> >>
> >> [ INFO: possible circular locking dependency detected ]
> >> 2.6.30-rc1-00457-gb21597d-dirty #2
> >
> > I'm wondering ... your version hash suggests you used vanilla
> > upstream as a base for your test. There's a string of other fixes
> > from Frederic in tip:core/kill-the-BKL branch, have you picked them
> > all up when you did your testing?
> >
> > The most coherent way to test this would be to pick up the latest
> > core/kill-the-BKL git tree from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git core/kill-the-BKL
> >
> 
> I did not know about this branch, now I am testing it and there is 
> no more problem with that testcase (dbench).
> 
> I will continue testing.

Thanks for testing it! It seems reiserfs with Frederic's changes 
is more stable now on your system.

I saw your NFS circular locking kill-the-BKL problem report on LKML 
- also attached below.

Hopefully someone on the Cc: list with NFS experience can point out 
the BKL assumption that is causing this.

	Ingo

----- Forwarded message from Alexander Beregalov <a.beregalov@...il.com> -----

Date: Wed, 15 Apr 2009 22:08:01 +0400
From: Alexander Beregalov <a.beregalov@...il.com>
To: linux-kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, linux-nfs@...r.kernel.org
Subject: [core/kill-the-BKL] nfs3: possible circular locking dependency

Hi

I have pulled core/kill-the-BKL on top of 2.6.30-rc2.

device: '0:18': device_add

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.30-rc2-00057-g30aa902-dirty #5
-------------------------------------------------------
mount.nfs/1740 is trying to acquire lock:
 (kernel_mutex){+.+.+.}, at: [<00000000006f32dc>] lock_kernel+0x28/0x3c

but task is already holding lock:
 (&type->s_umount_key#24/1){+.+.+.}, at: [<00000000004b88a0>] sget+0x228/0x36c

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&type->s_umount_key#24/1){+.+.+.}:
       [<00000000004776d0>] lock_acquire+0x5c/0x74
       [<0000000000469f5c>] down_write_nested+0x38/0x50
       [<00000000004b88a0>] sget+0x228/0x36c
       [<00000000005688fc>] nfs_get_sb+0x80c/0xa7c
       [<00000000004b7ec8>] vfs_kern_mount+0x44/0xa4
       [<00000000004b7f84>] do_kern_mount+0x30/0xcc
       [<00000000004cf300>] do_mount+0x7c8/0x80c
       [<00000000004ed2a4>] compat_sys_mount+0x224/0x274
       [<0000000000406154>] linux_sparc_syscall32+0x34/0x40

-> #0 (kernel_mutex){+.+.+.}:
       [<00000000004776d0>] lock_acquire+0x5c/0x74
       [<00000000006f0ebc>] mutex_lock_nested+0x48/0x380
       [<00000000006f32dc>] lock_kernel+0x28/0x3c
       [<00000000006d20ec>] rpc_wait_bit_killable+0x64/0x8c
       [<00000000006f0620>] __wait_on_bit+0x64/0xc0
       [<00000000006f06e4>] out_of_line_wait_on_bit+0x68/0x7c
       [<00000000006d2938>] __rpc_execute+0x150/0x2b4
       [<00000000006d2ac0>] rpc_execute+0x24/0x34
       [<00000000006cc338>] rpc_run_task+0x64/0x74
       [<00000000006cc474>] rpc_call_sync+0x58/0x7c
       [<00000000005717b0>] nfs3_rpc_wrapper+0x24/0xa0
       [<0000000000572024>] do_proc_get_root+0x6c/0x10c
       [<00000000005720dc>] nfs3_proc_get_root+0x18/0x5c
       [<000000000056401c>] nfs_get_root+0x34/0x17c
       [<0000000000568adc>] nfs_get_sb+0x9ec/0xa7c
       [<00000000004b7ec8>] vfs_kern_mount+0x44/0xa4
       [<00000000004b7f84>] do_kern_mount+0x30/0xcc
       [<00000000004cf300>] do_mount+0x7c8/0x80c
       [<00000000004ed2a4>] compat_sys_mount+0x224/0x274
       [<0000000000406154>] linux_sparc_syscall32+0x34/0x40

other info that might help us debug this:

1 lock held by mount.nfs/1740:
 #0:  (&type->s_umount_key#24/1){+.+.+.}, at: [<00000000004b88a0>] sget+0x228/0x36c

stack backtrace:
Call Trace:
 [00000000004755ac] print_circular_bug_tail+0xfc/0x10c
 [0000000000476e24] __lock_acquire+0x12f0/0x1b40
 [00000000004776d0] lock_acquire+0x5c/0x74
 [00000000006f0ebc] mutex_lock_nested+0x48/0x380
 [00000000006f32dc] lock_kernel+0x28/0x3c
 [00000000006d20ec] rpc_wait_bit_killable+0x64/0x8c
 [00000000006f0620] __wait_on_bit+0x64/0xc0
 [00000000006f06e4] out_of_line_wait_on_bit+0x68/0x7c
 [00000000006d2938] __rpc_execute+0x150/0x2b4
 [00000000006d2ac0] rpc_execute+0x24/0x34
 [00000000006cc338] rpc_run_task+0x64/0x74
 [00000000006cc474] rpc_call_sync+0x58/0x7c
 [00000000005717b0] nfs3_rpc_wrapper+0x24/0xa0
 [0000000000572024] do_proc_get_root+0x6c/0x10c
 [00000000005720dc] nfs3_proc_get_root+0x18/0x5c
 [000000000056401c] nfs_get_root+0x34/0x17c
device: '0:19': device_add

----- End forwarded message -----
