linux-kernel - Re: [2.6.38-3.x] [BUG] soft lockup - CPU#X stuck for 23s! (vfs, autofs, vserver)

Message-ID: <CA+55aFztBvfGc+AWvTXhaxj54Ow6O-wHJnjr3T-2_BU5cd2EDw@mail.gmail.com>
Date:	Mon, 24 Sep 2012 10:22:11 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Paweł Sikora <pluto@...-linux.org>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	arekm@...-linux.org, baggins@...-linux.org
Subject: Re: [2.6.38-3.x] [BUG] soft lockup - CPU#X stuck for 23s! (vfs,
 autofs, vserver)

On Mon, Sep 24, 2012 at 4:23 AM, Herbert Poetzl <herbert@...hfloor.at> wrote:
>
> currently we do:
>
>         br_read_lock(&vfsmount_lock);
>         root = current->fs->root;
>         root_mnt = real_mount(root.mnt);
>         point = root.dentry;
>
>         while ((mnt != mnt->mnt_parent) && (mnt != root_mnt)) {
>                 point = mnt->mnt_mountpoint;
>                 mnt = mnt->mnt_parent;
>         }
>
>         ret = (mnt == root_mnt) && is_subdir(point, root.dentry);
>         br_read_unlock(&vfsmount_lock);
>
> and we have been considering moving the br_read_unlock()
> to right before the is_subdir() call

So the read lock itself should not cause problems. We have tons of
high-frequency read-lockers, over quite big areas.

But exactly because the readlockers are so high-frequency, I'd expect
any problems to be *found* by read-lockers.

But the *cause* would likely be
 - missing unlocks
 - write-locks
 - recursion on the locks (including read-locks: they are actually
per-cpu spinlocks)

but I'd have expected lockdep to find anything obvious like that.
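
To illustrate the recursion point, here is a simplified user-space model. It assumes a br lock is nothing more than one non-recursive flag per CPU; the names (mock_brlock, mock_read_trylock and friends) and NR_CPUS = 4 are made up for this sketch and are not the kernel API. A trylock returning false marks the place where a real spinlock would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* arbitrary size for the sketch */

/* Hypothetical model: one non-recursive lock flag per CPU. */
struct mock_brlock {
	bool held[NR_CPUS];
};

/* Reader side: take only this CPU's lock. false == "would spin". */
static bool mock_read_trylock(struct mock_brlock *l, int cpu)
{
	if (l->held[cpu])
		return false;
	l->held[cpu] = true;
	return true;
}

static void mock_read_unlock(struct mock_brlock *l, int cpu)
{
	l->held[cpu] = false;
}

/* Writer side: needs every CPU's lock; roll back on failure. */
static bool mock_write_trylock(struct mock_brlock *l)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!mock_read_trylock(l, cpu)) {
			while (cpu-- > 0)
				l->held[cpu] = false;
			return false;
		}
	}
	return true;
}

static void mock_write_unlock(struct mock_brlock *l)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		l->held[cpu] = false;
}

/* Walks through the three failure causes listed above. */
void demo_brlock_recursion(void)
{
	struct mock_brlock l = { { false } };

	assert(mock_read_trylock(&l, 1));	/* first read lock on CPU 1 */
	assert(!mock_read_trylock(&l, 1));	/* recursion on CPU 1 would spin */
	assert(mock_read_trylock(&l, 2));	/* other CPUs are unaffected */
	assert(!mock_write_trylock(&l));	/* writer waits for all readers */

	mock_read_unlock(&l, 2);
	/* CPU 1 "forgets" its unlock: the writer can never get in. */
	assert(!mock_write_trylock(&l));

	mock_read_unlock(&l, 1);
	assert(mock_write_trylock(&l));		/* all clear, writer proceeds */
	assert(!mock_read_trylock(&l, 0));	/* readers now wait for writer */
	mock_write_unlock(&l);
}
```

In this model a nested read lock on the same CPU fails just as a second writer would, which is why recursion on read locks belongs on the list even though readers on different CPUs never contend with each other.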

If the locking itself is fine, maybe the loop above (or the
is_subdir()) never terminates because the mnt->mnt_parent chain has
somehow become circular, for example due to corrupt data structures
being traversed inside the lock.
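
One way to test the circular-parent theory would be a debugging check along these lines. This is only a user-space sketch: struct mock_mount is a hypothetical stand-in for the kernel's struct mount with nothing but the parent link, and it assumes mnt_parent is never NULL and that the top of a tree points at itself, as the loop condition quoted above implies. It uses Floyd's tortoise-and-hare walk, so it terminates even when the chain is corrupt.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mount: only the parent link. */
struct mock_mount {
	struct mock_mount *mnt_parent;	/* the top of a tree points at itself */
};

/*
 * Returns true if following mnt_parent from 'start' never reaches a
 * self-parented root, i.e. the chain contains a cycle longer than one.
 * 'fast' moves two steps per iteration, 'slow' one; they can only meet
 * if the chain loops back on itself.
 */
static bool parent_chain_loops(const struct mock_mount *start)
{
	const struct mock_mount *slow = start;
	const struct mock_mount *fast = start;

	for (;;) {
		if (fast == fast->mnt_parent)
			return false;	/* reached a proper root */
		fast = fast->mnt_parent;
		if (fast == fast->mnt_parent)
			return false;
		fast = fast->mnt_parent;

		slow = slow->mnt_parent;
		if (slow == fast)
			return true;	/* hare caught the tortoise: cycle */
	}
}

/* A healthy three-level chain versus a corrupted two-node loop. */
void demo_parent_chain(void)
{
	struct mock_mount root, a, b, x, y;

	root.mnt_parent = &root;
	a.mnt_parent = &root;
	b.mnt_parent = &a;
	assert(!parent_chain_loops(&b));

	x.mnt_parent = &y;
	y.mnt_parent = &x;	/* neither is a root: the walk would spin */
	assert(parent_chain_loops(&x));
}
```

With a chain like x and y above, the while loop in the quoted code would never exit, because mnt never equals mnt->mnt_parent and, unless root_mnt happens to be on the cycle, never equals root_mnt either.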

I really don't know the vserver patches, so it would be much more
interesting if you could recreate the problems using a standard kernel.

                          Linus