Date:	Thu, 3 Oct 2013 20:43:51 +0100
From:	Al Viro <viro@...IV.linux.org.uk>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 17/17] RCU'd vfsmounts

On Thu, Oct 03, 2013 at 12:06:04PM -0700, Linus Torvalds wrote:
> On Thu, Oct 3, 2013 at 10:44 AM, Al Viro <viro@...iv.linux.org.uk> wrote:
> >
> > Anyway, I've done nicer variants of that protection for everything except
> > fuse (hadn't gotten around to it yet); see vfs.git#experimental now:
> 
> Ok, I did a quick test, and it looks ok here, so looking good for 3.13.
> 
> However, the new smp_mb() in mntput_no_expire() is quite noticeable in
> the path lookup stress-test profiles. And I'm not seeing what that
> allegedly protects against, especially if mnt_ns is NULL (ie all the
> common important cases).

In the common case ->mnt_ns is *not* NULL; that's what we get while
the damn thing is still mounted.

What we need to avoid is this:

mnt_ns non-NULL, mnt_count is 2
CPU1: umount -l					CPU2: mntput
umount_tree() clears mnt_ns
drop mount_lock.lock
namespace_unlock() calls mntput()
decrement mnt_count
see that mnt_ns is NULL
grab mount_lock.lock
check mnt_count
						decrement mnt_count
						see old value of mnt_ns
						decide to bugger off
see it equal to 1 (i.e. miss decrement on CPU2)
decide to bugger off

The barrier in mntput() is there to prevent that combination: either CPU2
sees mnt_ns cleared by CPU1, or CPU1 sees the mnt_count decrement done by
CPU2.  Its counterpart on CPU1 is provided by the spin_unlock/spin_lock
we've done between clearing mnt_ns and checking mnt_count.  Note that
synchronize_rcu() in namespace_unlock() and rcu_read_lock() in mntput() are
irrelevant here - the latter on CPU2 might very well have happened after the
former on CPU1, so umount -l did *not* wait for CPU2 to do anything.

Any suggestions re getting rid of that barrier?
