[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b875fdb47e17ab68d18c5e5e5cbd0ec70fec7ce9.camel@kernel.org>
Date: Sun, 28 Apr 2019 11:47:58 -0400
From: Jeff Layton <jlayton@...nel.org>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Ilya Dryomov <idryomov@...il.com>, ceph-devel@...r.kernel.org,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
linux-cifs <linux-cifs@...r.kernel.org>
Subject: Re: [GIT PULL] Ceph fixes for 5.1-rc7
On Sun, 2019-04-28 at 15:48 +0100, Al Viro wrote:
> On Sun, Apr 28, 2019 at 09:27:20AM -0400, Jeff Layton wrote:
>
> > I don't see a problem doing what you suggest. An offset + fixed length
> > buffer would be fine there.
> >
> > Is there a real benefit to using __getname though? It sucks when we have
> > to reallocate but I doubt that it happens with any frequency. Most of
> > these paths will end up being much shorter than PATH_MAX and that slims
> > down the memory footprint a bit.
>
> AFAICS, they are all short-lived; don't forget that slabs have cache,
> so in that situation allocations are cheap.
>
Fair enough. Al also pointed out on IRC that the __getname/__putname
caches are likely to be hot, so using that may be less costly cpu-wise.
> > Also, FWIW -- this code was originally copied from cifs'
> > build_path_from_dentry(). Should we aim to put something in common
> > infrastructure that both can call?
> >
> > There are some significant logic differences in the two functions though
> > so we might need some sort of callback function or something to know
> > when to stop walking.
>
> Not if you want it fast... Indirect calls are not cheap; the cost of
> those callbacks would be considerable. Besides, you want more than
> "where do I stop", right? It's also "what output do I use for this
> dentry", both for you and for cifs (there it's "which separator to use",
> in ceph it's "these we want represented as //")...
>
> Can it be called on detached subtree, during e.g. open_by_handle()?
> There it can get really fishy; you end up with base being at the
> random point on the way towards root. How does that work, and if
> it *does* work, why do we need the whole path in the first place?
>
This I'm not sure of. commit 79b33c8874334e (ceph: snapshot nfs re-
export) explains this a bit, but I'm not sure it really covers this
case.
Zheng/Sage, feel free to correct me here:
My understanding is that for snapshots you need the base inode number,
snapid, and the full path from there to the dentry for a ceph MDS call.
There is a filehandle type for a snapshotted inode:
struct ceph_nfs_snapfh {
u64 ino;
u64 snapid;
u64 parent_ino;
u32 hash;
} __attribute__ ((packed));
So I guess it is possible. You could do name_to_handle_at for an inode
deep down in a snapshotted tree, and then try to open_by_handle_at after
the dcache gets cleaned out for some other reason.
What I'm not clear on is why we need to build paths at all for
snapshots. Why is a parent inode number (inside the snapshot) + a snapid
+ dentry name not sufficient?
> BTW, for cifs there's no need to play with ->d_lock as we go. For
> ceph, the only need comes from looking at d_inode(), and I wonder if
> it would be better to duplicate that information ("is that a
> snapdir/nosnap") into dentry iself - would certainly be cheaper.
> OTOH, we are getting short on spare bits in ->d_flags...
We could stick that in ceph_dentry_info (->d_fsdata). We have a flags
field in there already.
--
Jeff Layton <jlayton@...nel.org>
Powered by blists - more mailing lists