Message-id: <176899164457.16766.16099772451425825775@noble.neil.brown.name>
Date: Wed, 21 Jan 2026 21:34:04 +1100
From: NeilBrown <neilb@...mail.net>
To: "Christoph Hellwig" <hch@...radead.org>
Cc: "Christoph Hellwig" <hch@...radead.org>,
"Christian Brauner" <brauner@...nel.org>,
"Jeff Layton" <jlayton@...nel.org>,
"Amir Goldstein" <amir73il@...il.com>,
"Alexander Viro" <viro@...iv.linux.org.uk>,
"Chuck Lever" <chuck.lever@...cle.com>,
"Olga Kornievskaia" <okorniev@...hat.com>,
"Dai Ngo" <Dai.Ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>,
"Hugh Dickins" <hughd@...gle.com>,
"Baolin Wang" <baolin.wang@...ux.alibaba.com>,
"Andrew Morton" <akpm@...ux-foundation.org>,
"Theodore Ts'o" <tytso@....edu>,
"Andreas Dilger" <adilger.kernel@...ger.ca>, "Jan Kara" <jack@...e.com>,
"Gao Xiang" <xiang@...nel.org>, "Chao Yu" <chao@...nel.org>,
"Yue Hu" <zbestahu@...il.com>, "Jeffle Xu" <jefflexu@...ux.alibaba.com>,
"Sandeep Dhavale" <dhavale@...gle.com>,
"Hongbo Li" <lihongbo22@...wei.com>, "Chunhai Guo" <guochunhai@...o.com>,
"Carlos Maiolino" <cem@...nel.org>, "Ilya Dryomov" <idryomov@...il.com>,
"Alex Markuze" <amarkuze@...hat.com>,
"Viacheslav Dubeyko" <slava@...eyko.com>, "Chris Mason" <clm@...com>,
"David Sterba" <dsterba@...e.com>,
"Luis de Bethencourt" <luisbg@...nel.org>,
"Salah Triki" <salah.triki@...il.com>,
"Phillip Lougher" <phillip@...ashfs.org.uk>,
"Steve French" <sfrench@...ba.org>, "Paulo Alcantara" <pc@...guebit.org>,
"Ronnie Sahlberg" <ronniesahlberg@...il.com>,
"Shyam Prasad N" <sprasad@...rosoft.com>,
"Bharath SM" <bharathsm@...rosoft.com>,
"Miklos Szeredi" <miklos@...redi.hu>,
"Mike Marshall" <hubcap@...ibond.com>,
"Martin Brandenburg" <martin@...ibond.com>,
"Mark Fasheh" <mark@...heh.com>, "Joel Becker" <jlbec@...lplan.org>,
"Joseph Qi" <joseph.qi@...ux.alibaba.com>,
"Konstantin Komarov" <almaz.alexandrovich@...agon-software.com>,
"Ryusuke Konishi" <konishi.ryusuke@...il.com>,
"Trond Myklebust" <trondmy@...nel.org>,
"Anna Schumaker" <anna@...nel.org>, "Dave Kleikamp" <shaggy@...nel.org>,
"David Woodhouse" <dwmw2@...radead.org>,
"Richard Weinberger" <richard@....at>, "Jan Kara" <jack@...e.cz>,
"Andreas Gruenbacher" <agruenba@...hat.com>,
"OGAWA Hirofumi" <hirofumi@...l.parknet.co.jp>,
"Jaegeuk Kim" <jaegeuk@...nel.org>, linux-nfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-ext4@...r.kernel.org,
linux-erofs@...ts.ozlabs.org, linux-xfs@...r.kernel.org,
ceph-devel@...r.kernel.org, linux-btrfs@...r.kernel.org,
linux-cifs@...r.kernel.org, linux-unionfs@...r.kernel.org,
devel@...ts.orangefs.org, ocfs2-devel@...ts.linux.dev,
ntfs3@...ts.linux.dev, linux-nilfs@...r.kernel.org,
jfs-discussion@...ts.sourceforge.net, linux-mtd@...ts.infradead.org,
gfs2@...ts.linux.dev, linux-f2fs-devel@...ts.sourceforge.net
Subject: Re: [PATCH 00/29] fs: require filesystems to explicitly opt-in to
nfsd export support
On Wed, 21 Jan 2026, Christoph Hellwig wrote:
> On Tue, Jan 20, 2026 at 08:27:46PM +1100, NeilBrown wrote:
> > > If you think NFS actually explains the semantics pretty well, please
> > > explain that too, especially in forms that can be put into
> > > documentation, including for the user ABI.
> >
> > There are multiple issues here:
> >
> > - filehandle stability. As far as I know all filesystems provide
> > stable filehandles when the "subtree_check" export option is not used.
>
> That is news to me, but certainly interesting. Does this include not
> reusing the file handle for a new incarnation of the same thing?
"stable" and "reuse" are quite distinct concepts in my mind.
"a new incarnation of the same thing" is in my experience a new thing.
    rmdir foo; mkdir foo
on an empty directory will create a new incarnation of the same thing.
But it will appear to be different in various ways.
Names, not file handles, are generally used for new incarnations of the
same thing (again - in my experience).
I cannot 100% guarantee that all fs's provide filehandle stability, but
I am not aware of any that don't, and none have been presented in this
discussion.
It is true that the NFSv4 spec claims to allow unstable ("volatile") file
handles, but I find the details provided insufficient.
They might be able to work reliably if the server provided a delegation, but
without it I don't think they can be used reliably. I'm certainly not
aware of any attempt to support them in Linux client or server.
(I know Trond doesn't like "connectable" file handles).
>
> > Certainly cgroupfs does. So having an EXPORT_OP_STABLE_HANDLES
> > flag would mean it was set for every filesystem - unless there is
> > something else I'm not aware of. That is certainly possible and I
> > hope someone will let me know if I'm missing something.
>
> Well, if it does not provide stable file handles with the subtree_check
> export option, or more importantly with the CONNECTABLE flag passed
> to encode_fh, which is the level we're operating on, it can't set the
> flag.
>
Hmmm... I didn't know that open_by_handle_at() supported CONNECTABLE
requests. That seems relatively recent.
If CONNECTABLE is requested, then only directories get stable
filehandles.
If CONNECTABLE is not requested, then all filehandles should be stable.
> > - filehandle uniqueness. This is somewhat important and if a
> > filesystem doesn't provide it, that should be considered a bug. In a
> > different thread Christian has observed that there would be benefit
> > if pidfs and nsfs provided uniqueness across reboots. It is quite
> > easy for a virtual filesystem to generate a 64 bit random number when
> > the fs is initialised, and include that in file handles. Having a
> > EXPORT_OP_REUSES_HANDLES flag could mark filesystems that are still
> > buggy if that is thought to be useful.
>
> Yes.
>
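A rough userspace sketch of the random-number idea quoted above, in C.
This is only an illustration under stated assumptions: the names
sketch_fh, sketch_fs_init(), sketch_encode_fh() and sketch_decode_fh()
are hypothetical and are not existing kernel interfaces. In a real
filesystem the equivalent logic would presumably sit behind the
encode_fh and fh_to_dentry export operations.

```c
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical handle layout: a per-instance random id plus the inode number. */
struct sketch_fh {
	uint64_t instance_id;	/* chosen once when the fs is initialised */
	uint64_t ino;
};

/* Random id for the current "boot" of this hypothetical filesystem. */
static uint64_t fs_instance_id;

/* Pick a fresh 64-bit random id; returns 0 on success, -1 on failure. */
int sketch_fs_init(void)
{
	ssize_t n = getrandom(&fs_instance_id, sizeof(fs_instance_id), 0);

	return n == (ssize_t)sizeof(fs_instance_id) ? 0 : -1;
}

/* Build a handle that is tied to the current instance id. */
void sketch_encode_fh(struct sketch_fh *fh, uint64_t ino)
{
	fh->instance_id = fs_instance_id;
	fh->ino = ino;
}

/*
 * Accept a handle only if it was issued by the current instance.
 * Returns 0 and stores the inode number on success, -1 if the handle
 * comes from an earlier instance (i.e. it is stale).
 */
int sketch_decode_fh(const struct sketch_fh *fh, uint64_t *ino)
{
	if (fh->instance_id != fs_instance_id)
		return -1;
	*ino = fh->ino;
	return 0;
}
```

Calling sketch_fs_init() a second time stands in for a reboot: a handle
encoded before that call is then rejected by sketch_decode_fh() (barring
a 1-in-2^64 collision) instead of silently naming a different object.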
> > - GETATTR always reporting file size of 0. This is the only concrete
> > symptom that Jeff has reported (that I have seen). This makes it
> > impossible to read files over NFS even if they have content.
> > Would EXPORT_OP_INACCURATE_SIZE be useful?
>
> i_size = 0 for a regular file sounds like a genuine bug to me. I'm
> actually surprised anything works with that.
Files in /proc are all size zero.
Files in /sys seem to be all 4096 (or maybe PAGE_SIZE).
Files in /sys/kernel/security are all size zero.
Files in /sys/fs/cgroup are all size zero.
I agree it is weird, but it seems to work ... though I do have a vague
memory of something not working because it used a library function to
read a file, and it needed to be fixed. No details come to mind except
that it was probably md related.
As some of these virtual files can be different every time they are
read, there is a TOCTOU issue with trying to make the i_size accurately
reflect the result of a subsequent read. I think the cost of setting an
accurate i_size, even when it is possible, is not seen as worthwhile.
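To make the size-zero behaviour concrete, here is a minimal userspace
sketch in C. It assumes a Linux system with /proc mounted, and the
helper names reported_size() and bytes_until_eof() are made up for this
example.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Size that stat() reports for path, or -1 on error. */
long long reported_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return (long long)st.st_size;
}

/* Read path until EOF without consulting st_size; bytes read, or -1 on error. */
long long bytes_until_eof(const char *path)
{
	char buf[4096];
	long long total = 0;
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;
	close(fd);
	return n < 0 ? -1 : total;
}
```

On a typical system reported_size("/proc/version") is 0 while
bytes_until_eof("/proc/version") is positive. Code that sizes its buffer
from st_size therefore sees an empty file, whereas code that simply
reads until EOF gets the content.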
>
> > - maintainer feature choice. A maintainer may choose not to support
> > export over NFS because they feel that there is no value and the
> > possible support burden would not be worth it.
>
> The maintainer has no way to disallow exporting through nfs. They can
> at best disallow exporting using the kernel nfs daemon if we provide
> that facility. But as I've argued multiple times, making arbitrary,
> selective and very narrow choices about use cases without technical
> backing for them (which then would be expressable as a flag like those
> listed by you above) is really bad software development practice, and
> not something that we usually do in the Linux kernel.
True: once you make files available to people you cannot control what
people will do with them.
So maybe you are saying "what is so special about knfsd that it gets
information that no-one else can get". I cannot argue against that.
>
> > There may be locking
> > / lease / etc issues that further complicate things. So it might be
> > reasonable for a maintainer to choose to forbid NFS export while
> > allowing local fhandle access. EXPORT_OP_NO_NFS_EXPORT.
>
> We already have an EXPORT_OP_NOLOCKS flag to deal with this.
>
> >
> > It took me a while to sift through the code/patches/comments and come to
> > this understanding and I apologise if I wasn't as clear earlier. But
> > my intuition was always that file handle stability was never the real
> > issue, and maintainer choice was. Hence my rejection of the
> > "STABLE_HANDLES" name.
>
> Why do you keep ignoring the fact that the stable handles are really
> important for anyone wanting to actually use them for their original
> storage purpose, be that for knfsd, a userland nfs daemon, or other
> storage applications in userspace despite explaining this countless
> times?
>
It isn't that I don't think they are important. It is that I think they
are universally provided (when not connectable).
If we add an EXPORT_OP_STABLE_FILEHANDLES flag, I believe we would need to
set it on every export_operations structure. So what would be the
point?
Thanks,
NeilBrown