Date:	Tue, 10 Sep 2013 10:33:52 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Josh Boyer <jwboyer@...oraproject.org>
Cc:	Al Viro <viro@...iv.linux.org.uk>,
	Waiman Long <Waiman.Long@...com>,
	"Linux-Kernel@...r. Kernel. Org" <linux-kernel@...r.kernel.org>,
	moneta.mace@...il.com
Subject: Re: kernel BUG at fs/dcache.c:648! with v3.11-7890-ge5c832d

On Tue, Sep 10, 2013 at 10:14 AM, Josh Boyer <jwboyer@...oraproject.org> wrote:
>
> We've had a user report a backtrace from hitting the
> BUG_ON(!ret->d_lockref.count) added with the lockref infrastructure
> (commit 98474236f72) on rawhide today[1].  I've grabbed the backtrace
> below.  The user has btrfs, NFS, and sshfs in usage with this oops.
>
> I've not seen anything similar, but I could have missed it.  Does this
> look familiar to anyone?

Nope. And the dget_parent() case itself hasn't even changed - that
BUG_ON() wasn't really added by the lockref code, it's just a
search-and-replace change of a BUG_ON(!d_count) to
BUG_ON(!d_lockref.count). The BUG_ON() existed before.

That whole "dget_parent()" thing is also in the _simple_ case (not RCU
mode), and the BUG_ON is for when the dentry is properly locked, so
that's all "safe" code. The refcount must have gotten corrupted
earlier.
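
To illustrate the point about the locked path, here is a rough user-space
sketch of it, not the actual fs/dcache.c code. The names struct node,
node_lock(), node_unlock() and node_get_parent() are hypothetical stand-ins,
a C11 atomic_flag stands in for the dentry spinlock, and the RCU read-side
section and the recheck of d_parent that the real function does are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: a lock, a refcount, a parent pointer. */
struct node {
    atomic_flag lock;      /* stands in for the per-dentry spinlock */
    unsigned int count;    /* stands in for d_lockref.count */
    struct node *parent;   /* stands in for d_parent */
};

static void node_lock(struct node *n)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&n->lock, memory_order_acquire))
        ;
}

static void node_unlock(struct node *n)
{
    atomic_flag_clear_explicit(&n->lock, memory_order_release);
}

/*
 * Simple (non-RCU) path: take the parent's lock, then bump its count.
 * The child pins its parent, so the count must already be nonzero here.
 * If the assertion fires, the count was corrupted somewhere earlier;
 * the increment under the lock is not what broke it.
 */
static struct node *node_get_parent(struct node *n)
{
    struct node *p = n->parent;

    node_lock(p);
    assert(p->count != 0);   /* analogous to BUG_ON(!ret->d_lockref.count) */
    p->count++;
    node_unlock(p);
    return p;
}
```

With a parent whose count starts at 1, node_get_parent() on its child
returns the parent and leaves the count at 2. A zero count at the assertion
means something dropped a reference it did not own before this point.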

Do you have the mainline git ID of that rawhide kernel? Because there
*was* a real bug in d_rcu_to_refcount. I don't see how it could
trigger that particular issue, but it could trigger scheduling while
in the rcu-protected region and that in turn could result in odd
things down the line, so..

That particular bug exists between commits 15570086b590 ("vfs:
reimplement d_rcu_to_refcount() using lockref_get_or_lock()") that
introduced it, and e5c832d55588 ("vfs: fix dentry RCU to refcounting
possibly sleeping dput()") that should have fixed it. But I don't know
what mainline kernel that "kernel-3.12.0-0.rc0.git16.2.fc21.x86_64" is
based on. I'm sure that information exists somewhere..

                Linus