Date:	Mon, 4 Jun 2012 00:28:20 +0100
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Dave Jones <davej@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: processes hung after sys_renameat, and 'missing' processes

On Mon, Jun 04, 2012 at 12:17:09AM +0100, Al Viro wrote:
> 
> > Also, sysrq-w is usually way more interesting than 't' when there are
> > processes stuck on a mutex.
> > 
> > Because yes, it looks like you have a boatload of trinity processes
> > stuck on an inode mutex. Looks like every single one of them is in
> > 'lock_rename()'. It *shouldn't* be an ABBA deadlock, since lockdep
> > should have noticed that, but who knows.
> 
> lock_rename() is a bit of a red herring here - they appear to be all
> within-directory renames, so it's just a "trying to rename something
> in a directory that has ->i_mutex held by something else".
> 
> IOW, something else in there is holding ->i_mutex - something that
> either hadn't been through lock_rename() at all or has already
> passed through it and still hadn't got around to unlock_rename().
> In either case, suspects won't have lock_rename() in the trace...

Everything in lock_rename() appears to be at lock_rename+0x3e.  Unless
there's a really huge number of filesystems on that box, this has to
be
                mutex_lock_nested(&p1->d_inode->i_mutex, I_MUTEX_PARENT);
and everything on that sucker is not holding any locks yet.  IOW, that's
the tail hanging off whatever deadlock is there.

One possibility is that something has left the kernel without releasing
i_mutex on some directory, which would make atomic_open patches the most
obvious suspects.
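
To make that failure mode concrete, here is a minimal userspace sketch, not
kernel code.  It assumes POSIX threads; fake_dir, buggy_op() and
try_same_dir_rename() are hypothetical names made up for illustration, and
the sketch says nothing about where the real leak (if any) is.  One path
returns with the directory lock still held; every later within-directory
"rename" then finds the lock busy, and none of the victims holds anything
itself.  The rename uses trylock so the sketch reports the condition
instead of hanging the way the real thing does.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a directory inode; only the lock matters. */
struct fake_dir {
	pthread_mutex_t i_mutex;
};

/*
 * Hypothetical buggy operation: takes the directory lock, then bails
 * out on an error path without dropping it.
 */
static int buggy_op(struct fake_dir *dir, bool fail)
{
	assert(dir);
	pthread_mutex_lock(&dir->i_mutex);
	if (fail)
		return -EIO;	/* BUG: returns with i_mutex still held */
	pthread_mutex_unlock(&dir->i_mutex);
	return 0;
}

/*
 * Stand-in for a within-directory rename: it needs the same lock.
 * The kernel would sleep in mutex_lock_nested() here; this sketch
 * uses trylock and returns -EBUSY instead of blocking forever.
 */
static int try_same_dir_rename(struct fake_dir *dir)
{
	int err;

	assert(dir);
	err = pthread_mutex_trylock(&dir->i_mutex);
	if (err)
		return -err;	/* lock is held by somebody else */
	/* ... the actual rename work would go here ... */
	pthread_mutex_unlock(&dir->i_mutex);
	return 0;
}
```

After one buggy_op(dir, true), every try_same_dir_rename(dir) on that
directory fails the same way, which is the userspace analogue of a pile of
tasks all sitting at the same offset in lock_rename().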

Which kernel is it, and what filesystems are there?  Is there nfsd anywhere
in the mix?