Message-ID: <20260113-mondlicht-raven-82fc4eb70e9d@brauner>
Date: Tue, 13 Jan 2026 09:54:15 +0100
From: Christian Brauner <brauner@...nel.org>
To: Jeff Layton <jlayton@...nel.org>, Amir Goldstein <amir73il@...il.com>
Cc: Chuck Lever <chuck.lever@...cle.com>, Jan Kara <jack@...e.cz>,
Luis de Bethencourt <luisbg@...nel.org>, Salah Triki <salah.triki@...il.com>,
Nicolas Pitre <nico@...xnic.net>, Christoph Hellwig <hch@...radead.org>,
Anders Larsen <al@...rsen.net>, Alexander Viro <viro@...iv.linux.org.uk>,
David Sterba <dsterba@...e.com>, Chris Mason <clm@...com>, Gao Xiang <xiang@...nel.org>,
Chao Yu <chao@...nel.org>, Yue Hu <zbestahu@...il.com>,
Jeffle Xu <jefflexu@...ux.alibaba.com>, Sandeep Dhavale <dhavale@...gle.com>,
Hongbo Li <lihongbo22@...wei.com>, Chunhai Guo <guochunhai@...o.com>, Jan Kara <jack@...e.com>,
Theodore Ts'o <tytso@....edu>, Andreas Dilger <adilger.kernel@...ger.ca>,
Jaegeuk Kim <jaegeuk@...nel.org>, OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
David Woodhouse <dwmw2@...radead.org>, Richard Weinberger <richard@....at>,
Dave Kleikamp <shaggy@...nel.org>, Ryusuke Konishi <konishi.ryusuke@...il.com>,
Viacheslav Dubeyko <slava@...eyko.com>, Konstantin Komarov <almaz.alexandrovich@...agon-software.com>,
Mark Fasheh <mark@...heh.com>, Joel Becker <jlbec@...lplan.org>,
Joseph Qi <joseph.qi@...ux.alibaba.com>, Mike Marshall <hubcap@...ibond.com>,
Martin Brandenburg <martin@...ibond.com>, Miklos Szeredi <miklos@...redi.hu>,
Phillip Lougher <phillip@...ashfs.org.uk>, Carlos Maiolino <cem@...nel.org>,
Hugh Dickins <hughd@...gle.com>, Baolin Wang <baolin.wang@...ux.alibaba.com>,
Andrew Morton <akpm@...ux-foundation.org>, Namjae Jeon <linkinjeon@...nel.org>,
Sungjong Seo <sj1557.seo@...sung.com>, Yuezhang Mo <yuezhang.mo@...y.com>,
Alexander Aring <alex.aring@...il.com>, Andreas Gruenbacher <agruenba@...hat.com>,
Jonathan Corbet <corbet@....net>, "Matthew Wilcox (Oracle)" <willy@...radead.org>,
Eric Van Hensbergen <ericvh@...nel.org>, Latchesar Ionkov <lucho@...kov.net>,
Dominique Martinet <asmadeus@...ewreck.org>, Christian Schoenebeck <linux_oss@...debyte.com>,
Xiubo Li <xiubli@...hat.com>, Ilya Dryomov <idryomov@...il.com>,
Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
Steve French <sfrench@...ba.org>, Paulo Alcantara <pc@...guebit.org>,
Ronnie Sahlberg <ronniesahlberg@...il.com>, Shyam Prasad N <sprasad@...rosoft.com>,
Tom Talpey <tom@...pey.com>, Bharath SM <bharathsm@...rosoft.com>,
Hans de Goede <hansg@...nel.org>, linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-btrfs@...r.kernel.org, linux-erofs@...ts.ozlabs.org, linux-ext4@...r.kernel.org,
linux-f2fs-devel@...ts.sourceforge.net, linux-mtd@...ts.infradead.org,
jfs-discussion@...ts.sourceforge.net, linux-nilfs@...r.kernel.org, ntfs3@...ts.linux.dev,
ocfs2-devel@...ts.linux.dev, devel@...ts.orangefs.org, linux-unionfs@...r.kernel.org,
linux-xfs@...r.kernel.org, linux-mm@...ck.org, gfs2@...ts.linux.dev,
linux-doc@...r.kernel.org, v9fs@...ts.linux.dev, ceph-devel@...r.kernel.org,
linux-nfs@...r.kernel.org, linux-cifs@...r.kernel.org, samba-technical@...ts.samba.org
Subject: Re: [PATCH 00/24] vfs: require filesystems to explicitly opt-in to
lease support
On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
> On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
> > On 1/12/26 8:34 AM, Jeff Layton wrote:
> > > On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
> > > > On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton <jlayton@...nel.org> wrote:
> > > > >
> > > > > On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
> > > > > > On Thu 08-01-26 12:12:55, Jeff Layton wrote:
> > > > > > > Yesterday, I sent patches to fix how directory delegation support is
> > > > > > > > handled on filesystems where it should be disabled [1]. That set is
> > > > > > > appropriate for v6.19. For v7.0, I want to make lease support be more
> > > > > > > opt-in, rather than opt-out:
> > > > > > >
> > > > > > > For historical reasons, when ->setlease() file_operation is set to NULL,
> > > > > > > the default is to use the kernel-internal lease implementation. This
> > > > > > > means that if you want to disable them, you need to explicitly set the
> > > > > > > ->setlease() file_operation to simple_nosetlease() or the equivalent.
> > > > > > >
> > > > > > > This has caused a number of problems over the years as some filesystems
> > > > > > > > have inadvertently allowed leases to be acquired simply by having left
> > > > > > > it set to NULL. It would be better if filesystems had to opt-in to lease
> > > > > > > support, particularly with the advent of directory delegations.
> > > > > > >
> > > > > > > > This series sets the ->setlease() operation in a pile of existing
> > > > > > > local filesystems to generic_setlease() and then changes
> > > > > > > kernel_setlease() to return -EINVAL when the setlease() operation is not
> > > > > > > set.
> > > > > > >
> > > > > > > With this change, new filesystems will need to explicitly set the
> > > > > > > ->setlease() operations in order to provide lease and delegation
> > > > > > > support.
> > > > > > >
> > > > > > > I mainly focused on filesystems that are NFS exportable, since NFS and
> > > > > > > SMB are the main users of file leases, and they tend to end up exporting
> > > > > > > the same filesystem types. Let me know if I've missed any.
> > > > > >
> > > > > > So, what about kernfs and fuse? They seem to be exportable and don't have
> > > > > > .setlease set...
> > > > > >
> > > > >
> > > > > Yes, FUSE needs this too. I'll add a patch for that.
> > > > >
> > > > > As far as kernfs goes: AIUI, that's basically what sysfs and resctrl
> > > > > are built on. Do we really expect people to set leases there?
> > > > >
> > > > > I guess it's technically a regression since you could set them on those
> > > > > sorts of files earlier, but people don't usually export kernfs based
> > > > > filesystems via NFS or SMB, and that seems like something that could be
> > > > > used to make mischief.
> > > > >
> > > > > AFAICT, kernfs_export_ops is mostly to support open_by_handle_at(). See
> > > > > commit aa8188253474 ("kernfs: add exportfs operations").
> > > > >
> > > > > One idea: we could add a wrapper around generic_setlease() for
> > > > > filesystems like this that will do a WARN_ONCE() and then call
> > > > > generic_setlease(). That would keep leases working on them but we might
> > > > > get some reports that would tell us who's setting leases on these files
> > > > > and why.
> > > >
> > > > IMO, you are being too cautious, but whatever.
> > > >
> > > > It is not accurate that kernfs filesystems are NFS exportable in general.
> > > > Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.
> > > >
> > > > If any application is using leases on cgroup files, it must be some
> > > > very advanced runtime (i.e. systemd), so we should know about the
> > > > regression sooner rather than later.
> > > >
> > >
> > > I think so too. For now, I think I'll not bother with the WARN_ONCE().
> > > Let's just leave kernfs out of the set until someone presents a real
> > > use-case.
> > >
> > > > There are also the recently added nsfs and pidfs export_operations.
> > > >
> > > > I have a recollection about wanting to be explicit about not allowing
> > > > those to be exportable to NFS (nsfs specifically), but I can't see where
> > > > and if that restriction was done.
> > > >
> > > > Christian? Do you remember?
> > > >
> > >
> > > (cc'ing Chuck)
> > >
> > > FWIW, you can currently export and mount /sys/fs/cgroup via NFS. The
> > > directory doesn't show up when you try to get to it via NFSv4, but you
> > > can mount it using v3 and READDIR works. The files are all empty when
> > > you try to read them. I didn't try to do any writes.
> > >
> > > Should we add a mechanism to prevent exporting these sorts of
> > > filesystems?
> > >
> > > Even better would be to make nfsd exporting explicitly opt-in. What if
> > > > we were to add an EXPORT_OP_NFSD flag that explicitly allows filesystems
> > > to opt-in to NFS exporting, and check for that in __fh_verify()? We'd
> > > have to add it to a bunch of existing filesystems, but that's fairly
> > > simple to do with an LLM.
> >
> > What's the active harm in exporting /sys/fs/cgroup ? It has to be done
> > explicitly via /etc/exports, so this is under the NFS server admin's
> > control. Is it an attack surface?
> >
>
> Potentially?
>
> I don't see any active harm with exporting cgroupfs. It doesn't work
> right via nfsd, but it's not crashing the box or anything.
>
> At one time, those were only defined by filesystems that wanted to
> allow NFS export. Now we've grown them on filesystems that just want to
> provide filehandles for open_by_handle_at() and the like. nfsd doesn't
> care though: if the fs has export operations, it'll happily use them.
>
> > Having an explicit "I want to allow nfsd" flag seems like it might
> save us some headaches in the future when other filesystems add export
> ops for this sort of filehandle use.

So we are re-hashing a discussion we had a few months ago (Amir was
involved at least).

I don't think we want to expose cgroupfs via NFS; that's super weird.
It's like remote partial resource management and it would be very
strange if a remote process would suddenly be able to move things around
in the cgroup tree. So I would prefer to not do this.

So my preference would be to really sever file handles from the export
mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs to
use file handles via name_to_handle_at() and open_by_handle_at() without
making them exportable.

Somehow I thought that Amir had already done that work a while ago but
maybe it was really just about name_to_handle_at() and not also
open_by_handle_at()...
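
To make the distinction concrete, here is a rough userspace model of what
"file handles without exportability" could look like. This is only a sketch,
not kernel code: EXPORT_OP_NFSD is the hypothetical flag proposed earlier in
this thread, and the struct, the helper names and the two example instances
are made up for illustration. The idea is simply that the file handle
syscalls only care whether export operations exist, while nfsd would also
require the explicit opt-in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opt-in flag from the proposal in this thread (assumption). */
#define EXPORT_OP_NFSD 0x1UL

/* Simplified stand-in for the kernel's struct export_operations. */
struct export_operations_model {
	unsigned long flags;
};

/*
 * name_to_handle_at()/open_by_handle_at() style check in this model:
 * having export operations at all is enough.
 */
static bool may_use_file_handles(const struct export_operations_model *ops)
{
	return ops != NULL;
}

/*
 * nfsd style check in this model: the filesystem must also have opted in
 * explicitly via the (hypothetical) EXPORT_OP_NFSD flag.
 */
static bool nfsd_may_export(const struct export_operations_model *ops)
{
	return ops != NULL && (ops->flags & EXPORT_OP_NFSD) != 0;
}

/* A regular disk filesystem that opts in to NFS export (illustrative). */
static const struct export_operations_model ext4_like_ops = {
	.flags = EXPORT_OP_NFSD,
};

/* A cgroupfs/pidfs/nsfs style filesystem: handles yes, nfsd no. */
static const struct export_operations_model cgroupfs_like_ops = {
	.flags = 0,
};
```

In this model cgroupfs_like_ops passes may_use_file_handles() but fails
nfsd_may_export(), while ext4_like_ops passes both. Where such a check would
actually live in nfsd (e.g. __fh_verify(), as suggested above) is left open.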