Date:   Fri, 23 Feb 2018 17:42:16 +0000
From:   Al Viro <viro@...IV.linux.org.uk>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     linux-fsdevel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Christoph Hellwig <hch@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 6/6] fs/dcache: Avoid remaining try_lock loop in
 shrink_dentry_list()

On Fri, Feb 23, 2018 at 03:09:28PM +0000, Al Viro wrote:
> You are conflating the "we have a reference" cases with this one, and
> they are very different.  Note, BTW, that had we raced with somebody
> else grabbing a reference, we would've quietly dropped dentry from
> the shrink list; what if we do the following: just after checking that
> refcount is not positive, do
> 	inode = dentry->d_inode;
> 	if unlikely(inode && !spin_trylock...)
> 		rcu_read_lock
> 		drop ->d_lock
> 		grab inode->i_lock
> 		grab ->d_lock
> 		if unlikely(dentry->d_inode != inode)
> 			drop inode->i_lock
> 			rcu_read_unlock
> 			if !killed
> 				drop ->d_lock
> 				drop parent's ->d_lock
> 				continue;
> 		else
> 			rcu_read_unlock
> *before* going into
>                 if (unlikely(dentry->d_flags & DCACHE_DENTRY_KILLED)) {
>                         bool can_free = dentry->d_flags & DCACHE_MAY_FREE;
>                         spin_unlock(&dentry->d_lock);
> 			...
> part?

Owww....  It's actually even nastier than I realized - dropping ->d_lock
opens us to having the sucker freed by dput() from another thread here.
IOW, between d_shrink_del(dentry) and __dentry_kill(dentry) dropping ->d_lock
is dangerous...

It's really very different from all other cases, and the trickiest by far.

FWIW, my impression from the series:
	1) dentry_kill() should deal with trylock failures on its own, leaving
the callers only the real "we need to drop the parent" case.  See upthread for
one variant of doing that.
	2) switching parent eviction in shrink_dentry_list() to dentry_kill()
is fine.
	3) for d_delete() trylock loop is wrong; however, it does not need
anything more elaborate than
{
	struct inode *inode;
	int isdir = d_is_dir(dentry);
	/*
	 * Are we the only user?
	 */
	spin_lock(&dentry->d_lock);
	if (dentry->d_lockref.count != 1)
		goto Shared;

	inode = dentry->d_inode;
	if (unlikely(!spin_trylock(&inode->i_lock))) {
		spin_unlock(&dentry->d_lock);
		spin_lock(&inode->i_lock);
		spin_lock(&dentry->d_lock);
		if (dentry->d_lockref.count != 1) {
			spin_unlock(&inode->i_lock);
			goto Shared;
		}
	}

	dentry->d_flags &= ~DCACHE_CANT_MOUNT;
	dentry_unlink_inode(dentry);
	fsnotify_nameremove(dentry, isdir);
	return;

Shared:	/* can't make it negative, must unhash */
	if (!d_unhashed(dentry))
		__d_drop(dentry);
	spin_unlock(&dentry->d_lock);

	fsnotify_nameremove(dentry, isdir);
}

If not an outright "lock inode first from the very beginning" - note that
inode is stable (and non-NULL) here.  IOW, that needs to be compared with
{
	struct inode *inode = dentry->d_inode;
	int isdir = d_is_dir(dentry);
	spin_lock(&inode->i_lock);
	spin_lock(&dentry->d_lock);
	/*
	 * Are we the only user?
	 */
	if (dentry->d_lockref.count == 1) {
		dentry->d_flags &= ~DCACHE_CANT_MOUNT;
		dentry_unlink_inode(dentry);
	} else {
		if (!d_unhashed(dentry))
			__d_drop(dentry);
		spin_unlock(&dentry->d_lock);
		spin_unlock(&inode->i_lock);
	}
	fsnotify_nameremove(dentry, isdir);
}

That costs an extra round of taking ->i_lock when the dentry is shared, but
it's much shorter and simpler that way.  Needs profiling; if the second variant
does not perform worse, I would definitely prefer that one.
	4) the nasty one - shrink_dentry_list() evictions of zero-count dentries.
_That_ calls for careful use of RCU, etc. - none of the others need that.  Need
to think how to deal with that sucker; in any case, I do not believe that sharing
said RCU use, etc. with any other cases would do anything other than obfuscating
the rest.
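
For comparison, the two d_delete() variants from 3) can be modelled roughly
in userspace.  This is only a sketch under loud assumptions: pthread mutexes
stand in for ->i_lock and ->d_lock, and toy_inode, toy_dentry and both
function names are made up for illustration; none of this is kernel code, and
unlike dentry_unlink_inode() the model drops the locks explicitly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct toy_inode {
	pthread_mutex_t i_lock;
};

struct toy_dentry {
	pthread_mutex_t d_lock;
	int count;			/* models d_lockref.count */
	bool hashed;			/* models !d_unhashed() */
	struct toy_inode *inode;	/* models d_inode */
};

/*
 * Variant 1: take d_lock, trylock i_lock; on contention drop d_lock,
 * retake both in i_lock -> d_lock order and recheck the count.
 * Returns true if the dentry was made negative, false if unhashed.
 */
static bool toy_delete_trylock(struct toy_dentry *d)
{
	struct toy_inode *inode;

	pthread_mutex_lock(&d->d_lock);
	if (d->count != 1)
		goto shared;

	inode = d->inode;
	assert(inode != NULL);	/* caller holds a reference, inode is stable */
	if (pthread_mutex_trylock(&inode->i_lock) != 0) {
		pthread_mutex_unlock(&d->d_lock);
		pthread_mutex_lock(&inode->i_lock);
		pthread_mutex_lock(&d->d_lock);
		if (d->count != 1) {
			pthread_mutex_unlock(&inode->i_lock);
			goto shared;
		}
	}

	/* Sole user with both locks held: detach the inode. */
	d->inode = NULL;
	pthread_mutex_unlock(&d->d_lock);
	pthread_mutex_unlock(&inode->i_lock);
	return true;

shared:	/* can't make it negative, must unhash */
	d->hashed = false;
	pthread_mutex_unlock(&d->d_lock);
	return false;
}

/* Variant 2: lock the inode first from the very beginning. */
static bool toy_delete_inode_first(struct toy_dentry *d)
{
	struct toy_inode *inode = d->inode;
	bool negative = false;

	assert(inode != NULL);
	pthread_mutex_lock(&inode->i_lock);
	pthread_mutex_lock(&d->d_lock);
	if (d->count == 1) {
		d->inode = NULL;
		negative = true;
	} else {
		d->hashed = false;
	}
	pthread_mutex_unlock(&d->d_lock);
	pthread_mutex_unlock(&inode->i_lock);
	return negative;
}
```

The second function always pays for the inode lock, including in the shared
case, which is the cost that would need profiling; in exchange it has no
recheck path at all.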
