Message-ID: <20160607004058.GH14480@ZenIV.linux.org.uk>
Date: Tue, 7 Jun 2016 01:40:58 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Dave Hansen <dave.hansen@...el.com>,
    "Chen, Tim C" <tim.c.chen@...el.com>,
    Ingo Molnar <mingo@...hat.com>,
    Davidlohr Bueso <dbueso@...e.de>,
    "Peter Zijlstra (Intel)" <peterz@...radead.org>,
    Jason Low <jason.low2@...com>,
    Michel Lespinasse <walken@...gle.com>,
    "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
    Waiman Long <waiman.long@...com>,
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: performance delta after VFS i_mutex=>i_rwsem conversion

On Mon, Jun 06, 2016 at 04:50:59PM -0700, Linus Torvalds wrote:
>
>
> On Mon, 6 Jun 2016, Al Viro wrote:
> >
> > True in general, but here we really do a lot under that ->d_lock - all
> > list traversals are under it.  So I suspect that contention on the
> > nested lock is not an issue in that particular load.  It's certainly a
> > separate commit, so we'll see how much it gives on its own, but I doubt
> > that it'll be anywhere near enough.
>
> Hmm. Maybe.
>
> But at least we can try to minimize everything that happens under the
> dentry->d_lock spinlock.
>
> So how about this patch? It's entirely untested, but it rewrites that
> readdir() function to try to do the minimum possible under the d_lock
> spinlock.
>
> I say "rewrite", because it really is totally different. It's not just
> that the nested "next" locking is gone, it also treats the cursor very
> differently and tries to avoid doing any unnecessary cursor list
> operations.
Similar to what I've got here, except that mine has a couple of helper
functions usable in dcache_dir_lseek() as well (a usage sketch follows
below):

next_positive(parent, child, n) - returns the nth positive child after
that one, or NULL if there are fewer than n such; NULL as the second
argument => search from the beginning.

move_cursor(cursor, child) - moves the cursor immediately past child,
*or* to the very end if child is NULL.
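
To illustrate the intended use, a rough sketch of dcache_dir_lseek() on
top of those two helpers.  NB: illustration only, not the actual commit -
the whence/-EINVAL part is what's in the tree today, and I'm assuming
that (a) holding ->i_rwsem shared around the pair is enough and
(b) readdir rescans from the head of ->d_subdirs at ->f_pos == 2, so only
n > 0 needs the cursor moved:

loff_t dcache_dir_lseek(struct file *file, loff_t offset, int whence)
{
	struct dentry *dentry = file->f_path.dentry;
	switch (whence) {
		case 1:
			offset += file->f_pos;
		case 0:
			if (offset >= 0)
				break;
		default:
			return -EINVAL;
	}
	if (offset != file->f_pos) {
		file->f_pos = offset;
		if (file->f_pos >= 2) {
			struct dentry *cursor = file->private_data;
			loff_t n = file->f_pos - 2;

			inode_lock_shared(dentry->d_inode);
			/* assumption: pos == 2 rescans from the list head,
			 * so the cursor only matters past the first entry */
			if (n)
				move_cursor(cursor,
					    next_positive(dentry, NULL, n));
			inode_unlock_shared(dentry->d_inode);
		}
	}
	return offset;
}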
The third commit in the series will be the lockless replacement for
next_positive().  move_cursor() is easy - it became simply
static void move_cursor(struct dentry *cursor, struct dentry *child)
{
	struct dentry *parent = cursor->d_parent;
	unsigned n, *seq = &parent->d_inode->i_dir_seq;

	spin_lock(&parent->d_lock);
	/* make ->i_dir_seq odd, seqlock-style, to fence off lockless readers */
	for (;;) {
		n = *seq;
		if (!(n & 1) && cmpxchg(seq, n, n + 1) == n)
			break;
		cpu_relax();
	}
	__list_del(cursor->d_child.prev, cursor->d_child.next);
	if (child)
		list_add(&cursor->d_child, &child->d_child);
	else
		list_add_tail(&cursor->d_child, &parent->d_subdirs);
	/* even again; release pairs with smp_load_acquire() on the read side */
	smp_store_release(seq, n + 2);
	spin_unlock(&parent->d_lock);
}
with
static struct dentry *next_positive(struct dentry *parent,
				    struct dentry *child, int count)
{
	struct list_head *start = child ? &child->d_child : &parent->d_subdirs;
	unsigned *seq = &parent->d_inode->i_dir_seq, n;

	do {
		struct list_head *p = start;	/* restart from scratch on retry */
		int i = count;

		n = smp_load_acquire(seq) & ~1;
		rcu_read_lock();
		do {
			p = p->next;
			if (p == &parent->d_subdirs) {
				child = NULL;
				break;
			}
			child = list_entry(p, struct dentry, d_child);
		} while (!simple_positive(child) || --i);
		rcu_read_unlock();
		/* retry if ->i_dir_seq has moved (or had been odd all along) */
	} while (unlikely(smp_load_acquire(seq) != n));
	return child;
}
as the initial attempt at a lockless next_positive(); barriers are
probably wrong, though...
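
If they are, the obvious candidate for a fix (a sketch, not tested): the
trailing smp_load_acquire() only orders the loads *after* it, while what
the retry check actually needs is the loads done by the list traversal
ordered before ->i_dir_seq gets re-read - i.e. smp_rmb() at the tail of
the loop:

		rcu_read_unlock();
		smp_rmb();	/* traversal loads before the seq re-read */
	} while (unlikely(READ_ONCE(*seq) != n));

The smp_load_acquire() on the initial read stays - that one orders the
traversal after the first fetch of ->i_dir_seq, which is exactly what
acquire is for.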