Message-ID: <20190428164030.GC23075@ZenIV.linux.org.uk>
Date: Sun, 28 Apr 2019 17:40:30 +0100
From: Al Viro <viro@...iv.linux.org.uk>
To: Jeff Layton <jlayton@...nel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
linux-cifs <linux-cifs@...r.kernel.org>
Subject: Re: [GIT PULL] Ceph fixes for 5.1-rc7

On Sun, Apr 28, 2019 at 04:52:16PM +0100, Al Viro wrote:
> On Sun, Apr 28, 2019 at 11:47:58AM -0400, Jeff Layton wrote:
>
> > We could stick that in ceph_dentry_info (->d_fsdata). We have a flags
> > field in there already.
>
> Yes, but... You have it freed in ->d_release(), AFAICS, and without
> any delays.  So lockless accesses will be trouble.

You could RCU-delay the actual kmem_cache_free(ceph_dentry_cachep, di)
in there, but I've no idea whether the overhead would be painful -
on massive eviction (e.g. on memory pressure) it might be. Another
variant is to introduce ->d_free(), to be called from __d_free()
and __d_free_external(). That, however, would need another ->d_flags
bit for presence of that method, so that we don't get extra overhead
from looking into ->d_op...
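
For illustration only, a minimal userspace sketch of the "RCU-delay the
free" idea above.  Everything prefixed toy_ is hypothetical: the queue is
a stand-in for call_rcu() and a grace period, not the kernel API, and
toy_dentry_info is only loosely modelled on ceph_dentry_info.  The point
it tries to show is that ->d_release() can do its sync work right away
while the memory stays valid for lockless readers until the deferred
callback runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct rcu_head; not the real kernel type. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *);
};

static struct toy_rcu_head *toy_pending;	/* callbacks awaiting a "grace period" */
static int toy_freed;				/* number of objects actually freed */

/* Queue a callback; nothing is freed until toy_grace_period_end(). */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *))
{
	assert(func != NULL);
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Pretend all lockless readers are done: run the queued callbacks. */
static void toy_grace_period_end(void)
{
	struct toy_rcu_head *head = toy_pending;

	toy_pending = NULL;
	while (head) {
		struct toy_rcu_head *next = head->next;

		head->func(head);
		head = next;
	}
}

/* Hypothetical per-dentry private data with an embedded callback head. */
struct toy_dentry_info {
	unsigned long flags;
	struct toy_rcu_head rcu;
};

/* Deferred part: recover the container from the embedded head and free it. */
static void toy_di_free_rcu(struct toy_rcu_head *head)
{
	struct toy_dentry_info *di = (struct toy_dentry_info *)
		((char *)head - offsetof(struct toy_dentry_info, rcu));

	free(di);
	toy_freed++;
}

/* Model of a ->d_release() that does its sync work, then defers the free. */
static void toy_d_release(struct toy_dentry_info *di)
{
	/* ... synchronous teardown would go here ... */
	toy_call_rcu(&di->rcu, toy_di_free_rcu);
}
```

After toy_d_release() the flags field can still be read safely; it only
goes away once toy_grace_period_end() has run.  The cost in this model is
one queued callback per evicted dentry, which is the overhead the text
above is worried about on massive eviction.
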

Looking through ->d_release() instances, we have
	afs: empty, might as well have not been there
	autofs: does some sync stuff (eviction from ->active_list/->expire_list)
		plus kfree_rcu()
	ceph: some sync stuff + immediate kmem_cache_free()
	debugfs: kfree(), might or might not be worth RCU-delaying
	ecryptfs: sync stuff (path_put for ->lower) + RCU-delayed part
	fuse: kfree_rcu()
	nfs: kfree()
	overlayfs: a bunch of dput() (obviously sync) + kfree_rcu()
	9p: sync

So it actually might make sense to move the RCU-delayed bits to
a separate method.  Some ->d_release() instances would be simply
gone, as for the rest... I wonder which of the sync parts can
be moved over to ->d_prune(). Not guaranteed to be doable
(or a good idea), but... E.g. for autofs it almost certainly
would be the right place for the sync parts - we are,
essentially, telling the filesystem to forget its private
(non-refcounted) references to the victim.
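
A rough userspace sketch of that split, under stated assumptions: the
sk_ names, the flag values and the two-method ops table are made up for
illustration, and ->d_free() here is the proposed method, not an
existing one.  The sync part lives in ->d_prune(), the delayed part in
->d_free(), and the presence of each method is cached in a flags word
when the ops are set, so the free path only looks into ->d_op when the
bit says there is something to call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits caching which methods are present. */
#define SK_OP_PRUNE	0x1u
#define SK_OP_FREE	0x2u

struct sk_dentry;

struct sk_dentry_ops {
	void (*d_prune)(struct sk_dentry *);	/* sync: forget private references */
	void (*d_free)(struct sk_dentry *);	/* delayed: release private memory */
};

struct sk_dentry {
	unsigned int d_flags;
	const struct sk_dentry_ops *d_op;
	int sync_done;	/* set by the toy ->d_prune() */
	int freed;	/* set by the toy ->d_free() */
};

static int sk_op_lookups;	/* how often the free path touched ->d_op */

/* Record the ops and cache method presence in ->d_flags. */
static void sk_set_d_op(struct sk_dentry *d, const struct sk_dentry_ops *op)
{
	d->d_op = op;
	d->d_flags = 0;
	if (op && op->d_prune)
		d->d_flags |= SK_OP_PRUNE;
	if (op && op->d_free)
		d->d_flags |= SK_OP_FREE;
}

/* Eviction time: run the synchronous part, if the filesystem has one. */
static void sk_evict(struct sk_dentry *d)
{
	if (d->d_flags & SK_OP_PRUNE)
		d->d_op->d_prune(d);
}

/* Stand-in for the post-grace-period free path: test the flag first. */
static void sk_d_free(struct sk_dentry *d)
{
	if (d->d_flags & SK_OP_FREE) {
		sk_op_lookups++;
		d->d_op->d_free(d);
	}
}

/* A toy filesystem providing both halves. */
static void sk_fs_prune(struct sk_dentry *d) { d->sync_done = 1; }
static void sk_fs_free(struct sk_dentry *d) { d->freed = 1; }

static const struct sk_dentry_ops sk_fs_ops = {
	.d_prune = sk_fs_prune,
	.d_free  = sk_fs_free,
};
```

A dentry set up with sk_fs_ops gets its sync part from sk_evict() and
its delayed part from sk_d_free(); one set up with no ops never has
->d_op dereferenced on the free path at all.
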