Message-ID: <1277127322.1875.516.camel@laptop>
Date:	Mon, 21 Jun 2010 15:35:22 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Nick Piggin <npiggin@...e.de>
Cc:	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	john stultz <johnstul@...ibm.com>,
	John Kacur <jkacur@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [patch 11/33] fs: dcache scale subdirs

On Fri, 2010-06-18 at 02:53 +1000, Nick Piggin wrote:

> > Right, so this isn't going to work well, this dentry recursion is
> > basically unbounded afaict, so the 2nd subdir will also be locked using
> > DENTRY_D_LOCK_NESTED, resulting in the 1st and 2nd subdir both having
> > the same (sub)class and lockdep doesn't like that much.
> 
> No, it's a bit of a tricky loop, but it is not unbounded. It takes the
> parent, then the child, then it may continue again with the child as
> the new parent but in that case it drops the parent lock and tricks
> lockdep into not barfing.

Ah, indeed the thing you pointed out below should work.
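
To make the parent/child hand-over concrete, here is a minimal user-space
sketch of a tree walk that never holds more than two locks at once. It is
not the kernel code: struct node, node_lock(), node_unlock(), node_init()
and walk() are hypothetical stand-ins, pthread mutexes replace d_lock, and
the walk is single-threaded, so nothing here models the RCU or rename_lock
protection the real ascend step would need.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dentry; not the kernel structure. */
struct node {
	pthread_mutex_t lock;
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

static int held;	/* locks currently held by the walker */
static int max_held;	/* high-water mark of 'held' */

static void node_lock(struct node *n)
{
	pthread_mutex_lock(&n->lock);
	if (++held > max_held)
		max_held = held;
}

static void node_unlock(struct node *n)
{
	pthread_mutex_unlock(&n->lock);
	held--;
}

/* Initialise a node and link it at the head of its parent's child list. */
static void node_init(struct node *n, struct node *parent)
{
	pthread_mutex_init(&n->lock, NULL);
	n->parent = parent;
	n->first_child = NULL;
	n->next_sibling = NULL;
	if (parent) {
		n->next_sibling = parent->first_child;
		parent->first_child = n;
	}
}

/* Visit every descendant of root; returns the number of nodes visited. */
static int walk(struct node *root)
{
	struct node *this_parent = root;
	struct node *next;
	int visited = 0;

	node_lock(this_parent);
	next = this_parent->first_child;
resume:
	while (next) {
		struct node *child = next;

		next = child->next_sibling;
		node_lock(child);		/* parent + child: two locks */
		visited++;
		if (child->first_child) {
			/* Descend: drop the parent, child is the new parent. */
			node_unlock(this_parent);
			this_parent = child;
			next = this_parent->first_child;
			continue;
		}
		node_unlock(child);
	}
	if (this_parent != root) {
		/*
		 * Ascend: drop the child before taking its parent so the
		 * lock order stays parent -> child. No lock is held in
		 * between, which is only safe here because the sketch is
		 * single-threaded.
		 */
		struct node *child = this_parent;

		next = child->next_sibling;
		this_parent = child->parent;
		node_unlock(child);
		node_lock(this_parent);
		goto resume;
	}
	node_unlock(this_parent);
	return visited;
}
```

Calling walk() on a small tree built with node_init() leaves max_held at 2
regardless of depth, which is the property being argued for above.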

> > Do we really need to keep the whole path locked? One of the comments
> > seems to suggest we could actually drop some locks and re-acquire.
> 
> As far as I can tell, RCU should be able to cover it without taking more
> than 2 locks at a time. John saw some issues in the -rt tree (I haven't
> reproduced yet) so he's locking the full chains there but I hope that
> won't be needed.

Right, so I was staring at the -rt splat, so it's John who created that
wreckage?

static int select_parent(struct dentry * parent)
{
	struct dentry *this_parent;
	struct list_head *next;
	unsigned seq;
	int found;

rename_retry:
	found = 0;
	this_parent = parent;
	seq = read_seqbegin(&rename_lock);

	spin_lock(&this_parent->d_lock);
repeat:
	next = this_parent->d_subdirs.next;
resume:
	while (next != &this_parent->d_subdirs) {
		struct list_head *tmp = next;
		struct dentry *dentry = list_entry(tmp, struct dentry, d_u.d_child);
		next = tmp->next;

		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
		dentry_lru_del_init(dentry);
		/* 
		 * move only zero ref count dentries to the end 
		 * of the unused list for prune_dcache
		 */
		if (!atomic_read(&dentry->d_count)) {
			dentry_lru_add_tail(dentry);
			found++;
		}

		/*
		 * We can return to the caller if we have found some (this
		 * ensures forward progress). We'll be coming back to find
		 * the rest.
		 */
		if (found && need_resched()) {
			spin_unlock(&dentry->d_lock);
			goto out;
		}

		/*
		 * Descend a level if the d_subdirs list is non-empty.
		 * Note that we keep a hold on the parent lock while
		 * we descend, so we don't have to reacquire it on
		 * ascend.
		 */
		if (!list_empty(&dentry->d_subdirs)) {
			this_parent = dentry;
			goto repeat;
		}

		spin_unlock(&dentry->d_lock);
	}
	/*
	 * All done at this level ... ascend and resume the search.
	 */
	if (this_parent != parent) {
		struct dentry *tmp;
		struct dentry *child;

		tmp = this_parent->d_parent;
		child = this_parent;
		next = child->d_u.d_child.next;
		spin_unlock(&this_parent->d_lock);
		this_parent = tmp;
		goto resume;
	}

out:
	/* Make sure we unlock all the way back up the tree */
	while (this_parent != parent) {
		struct dentry *tmp = this_parent->d_parent;
		spin_unlock(&this_parent->d_lock);
		this_parent = tmp;
	}
	spin_unlock(&this_parent->d_lock);
	if (read_seqretry(&rename_lock, seq))
		goto rename_retry;
	return found;
}


> > >                 /*
> > >                  * Descend a level if the d_subdirs list is non-empty.
> > >                  */
> > >                 if (!list_empty(&dentry->d_subdirs)) {
> > > +                       spin_unlock(&this_parent->d_lock);
> > > +                       spin_release(&dentry->d_lock.dep_map, 1, _RET_IP_);
> > >                         this_parent = dentry;
> > > +                       spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> > >                         goto repeat;
> 
>                             ^^^ That's what we do when descending.

You can write that as:
  lock_set_subclass(&this_parent->d_lock.dep_map, 0, _RET_IP_);

See kernel/sched.c:double_unlock_balance().
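
As a rough illustration of why the re-annotation matters, here is a toy
user-space model of the subclass bookkeeping. It is an assumption-laden
sketch, not lockdep: model_acquire(), model_release(),
model_set_subclass() and descend() are hypothetical names, every lock is
treated as belonging to one class (like d_lock), and the only rule modelled
is "two held locks with the same subclass get flagged".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

/* Toy record of a held lock: its identity and its subclass annotation. */
struct held_lock {
	const void *lock;
	unsigned int subclass;
};

static struct held_lock held_locks[MAX_HELD];
static int nr_held;
static int complaints;	/* times same-subclass nesting was flagged */

/* Record an acquisition; flag it if that subclass is already held. */
static void model_acquire(const void *lock, unsigned int subclass)
{
	for (int i = 0; i < nr_held; i++)
		if (held_locks[i].subclass == subclass)
			complaints++;
	assert(nr_held < MAX_HELD);
	held_locks[nr_held].lock = lock;
	held_locks[nr_held].subclass = subclass;
	nr_held++;
}

/* Forget a held lock. */
static void model_release(const void *lock)
{
	for (int i = 0; i < nr_held; i++) {
		if (held_locks[i].lock == lock) {
			held_locks[i] = held_locks[--nr_held];
			return;
		}
	}
}

/* Change the subclass of an already-held lock, in the spirit of lock_set_subclass(). */
static void model_set_subclass(const void *lock, unsigned int subclass)
{
	for (int i = 0; i < nr_held; i++)
		if (held_locks[i].lock == lock)
			held_locks[i].subclass = subclass;
}

/*
 * Descend 'levels' times: take the child as subclass 1, drop the parent,
 * and optionally re-annotate the child as subclass 0 before it becomes
 * the new parent. Returns how many acquisitions were flagged.
 */
static int descend(int levels, int reannotate)
{
	static int locks[MAX_HELD];	/* addresses serve only as lock identities */
	int parent = 0;

	nr_held = 0;
	complaints = 0;
	model_acquire(&locks[parent], 0);
	for (int child = 1; child <= levels && child < MAX_HELD; child++) {
		model_acquire(&locks[child], 1);
		model_release(&locks[parent]);
		if (reannotate)
			model_set_subclass(&locks[child], 0);
		parent = child;
	}
	model_release(&locks[parent]);
	return complaints;
}
```

In this model descend(3, 1) flags nothing, while descend(3, 0) flags the
second and third descents, because the lock that has become the parent
still carries the nested subclass when the next child is taken.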


