Message-ID: <CAOQ4uxhkaGFtQRzTj2xaf2GJucoAY5CGiyUjB=8YA2zTbOtFvw@mail.gmail.com>
Date: Tue, 13 Jan 2026 12:03:37 +0100
From: Amir Goldstein <amir73il@...il.com>
To: Christian Brauner <brauner@...nel.org>
Cc: Jeff Layton <jlayton@...nel.org>, Chuck Lever <chuck.lever@...cle.com>, Jan Kara <jack@...e.cz>,
Luis de Bethencourt <luisbg@...nel.org>, Salah Triki <salah.triki@...il.com>,
Nicolas Pitre <nico@...xnic.net>, Christoph Hellwig <hch@...radead.org>, Anders Larsen <al@...rsen.net>,
Alexander Viro <viro@...iv.linux.org.uk>, David Sterba <dsterba@...e.com>, Chris Mason <clm@...com>,
Gao Xiang <xiang@...nel.org>, Chao Yu <chao@...nel.org>, Yue Hu <zbestahu@...il.com>,
Jeffle Xu <jefflexu@...ux.alibaba.com>, Sandeep Dhavale <dhavale@...gle.com>,
Hongbo Li <lihongbo22@...wei.com>, Chunhai Guo <guochunhai@...o.com>, Jan Kara <jack@...e.com>,
"Theodore Ts'o" <tytso@....edu>, Andreas Dilger <adilger.kernel@...ger.ca>,
Jaegeuk Kim <jaegeuk@...nel.org>, OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
David Woodhouse <dwmw2@...radead.org>, Richard Weinberger <richard@....at>, Dave Kleikamp <shaggy@...nel.org>,
Ryusuke Konishi <konishi.ryusuke@...il.com>, Viacheslav Dubeyko <slava@...eyko.com>,
Konstantin Komarov <almaz.alexandrovich@...agon-software.com>, Mark Fasheh <mark@...heh.com>,
Joel Becker <jlbec@...lplan.org>, Joseph Qi <joseph.qi@...ux.alibaba.com>,
Mike Marshall <hubcap@...ibond.com>, Martin Brandenburg <martin@...ibond.com>,
Miklos Szeredi <miklos@...redi.hu>, Phillip Lougher <phillip@...ashfs.org.uk>,
Carlos Maiolino <cem@...nel.org>, Hugh Dickins <hughd@...gle.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>, Andrew Morton <akpm@...ux-foundation.org>,
Namjae Jeon <linkinjeon@...nel.org>, Sungjong Seo <sj1557.seo@...sung.com>,
Yuezhang Mo <yuezhang.mo@...y.com>, Alexander Aring <alex.aring@...il.com>,
Andreas Gruenbacher <agruenba@...hat.com>, Jonathan Corbet <corbet@....net>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>, Eric Van Hensbergen <ericvh@...nel.org>,
Latchesar Ionkov <lucho@...kov.net>, Dominique Martinet <asmadeus@...ewreck.org>,
Christian Schoenebeck <linux_oss@...debyte.com>, Xiubo Li <xiubli@...hat.com>,
Ilya Dryomov <idryomov@...il.com>, Trond Myklebust <trondmy@...nel.org>,
Anna Schumaker <anna@...nel.org>, Steve French <sfrench@...ba.org>, Paulo Alcantara <pc@...guebit.org>,
Ronnie Sahlberg <ronniesahlberg@...il.com>, Shyam Prasad N <sprasad@...rosoft.com>,
Tom Talpey <tom@...pey.com>, Bharath SM <bharathsm@...rosoft.com>,
Hans de Goede <hansg@...nel.org>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-btrfs@...r.kernel.org,
linux-erofs@...ts.ozlabs.org, linux-ext4@...r.kernel.org,
linux-f2fs-devel@...ts.sourceforge.net, linux-mtd@...ts.infradead.org,
jfs-discussion@...ts.sourceforge.net, linux-nilfs@...r.kernel.org,
ntfs3@...ts.linux.dev, ocfs2-devel@...ts.linux.dev, devel@...ts.orangefs.org,
linux-unionfs@...r.kernel.org, linux-xfs@...r.kernel.org, linux-mm@...ck.org,
gfs2@...ts.linux.dev, linux-doc@...r.kernel.org, v9fs@...ts.linux.dev,
ceph-devel@...r.kernel.org, linux-nfs@...r.kernel.org,
linux-cifs@...r.kernel.org, samba-technical@...ts.samba.org
Subject: Re: [PATCH 00/24] vfs: require filesystems to explicitly opt-in to
lease support

On Tue, Jan 13, 2026 at 9:54 AM Christian Brauner <brauner@...nel.org> wrote:
>
> On Mon, Jan 12, 2026 at 09:50:20AM -0500, Jeff Layton wrote:
> > On Mon, 2026-01-12 at 09:31 -0500, Chuck Lever wrote:
> > > On 1/12/26 8:34 AM, Jeff Layton wrote:
> > > > On Fri, 2026-01-09 at 19:52 +0100, Amir Goldstein wrote:
> > > > > On Thu, Jan 8, 2026 at 7:57 PM Jeff Layton <jlayton@...nel.org> wrote:
> > > > > >
> > > > > > On Thu, 2026-01-08 at 18:40 +0100, Jan Kara wrote:
> > > > > > > On Thu 08-01-26 12:12:55, Jeff Layton wrote:
> > > > > > > > Yesterday, I sent patches to fix how directory delegation support is
> > > > > > > > > handled on filesystems where they should be disabled [1]. That set is
> > > > > > > > appropriate for v6.19. For v7.0, I want to make lease support be more
> > > > > > > > opt-in, rather than opt-out:
> > > > > > > >
> > > > > > > > For historical reasons, when ->setlease() file_operation is set to NULL,
> > > > > > > > the default is to use the kernel-internal lease implementation. This
> > > > > > > > means that if you want to disable them, you need to explicitly set the
> > > > > > > > ->setlease() file_operation to simple_nosetlease() or the equivalent.
> > > > > > > >
> > > > > > > > This has caused a number of problems over the years as some filesystems
> > > > > > > > > have inadvertently allowed leases to be acquired simply by having left
> > > > > > > > it set to NULL. It would be better if filesystems had to opt-in to lease
> > > > > > > > support, particularly with the advent of directory delegations.
> > > > > > > >
> > > > > > > > > This series sets the ->setlease() operation in a pile of existing
> > > > > > > > local filesystems to generic_setlease() and then changes
> > > > > > > > kernel_setlease() to return -EINVAL when the setlease() operation is not
> > > > > > > > set.
> > > > > > > >
> > > > > > > > With this change, new filesystems will need to explicitly set the
> > > > > > > > ->setlease() operations in order to provide lease and delegation
> > > > > > > > support.
> > > > > > > >
> > > > > > > > I mainly focused on filesystems that are NFS exportable, since NFS and
> > > > > > > > SMB are the main users of file leases, and they tend to end up exporting
> > > > > > > > the same filesystem types. Let me know if I've missed any.
> > > > > > >
> > > > > > > So, what about kernfs and fuse? They seem to be exportable and don't have
> > > > > > > .setlease set...
> > > > > > >
> > > > > >
> > > > > > Yes, FUSE needs this too. I'll add a patch for that.
> > > > > >
> > > > > > As far as kernfs goes: AIUI, that's basically what sysfs and resctrl
> > > > > > are built on. Do we really expect people to set leases there?
> > > > > >
> > > > > > I guess it's technically a regression since you could set them on those
> > > > > > sorts of files earlier, but people don't usually export kernfs based
> > > > > > filesystems via NFS or SMB, and that seems like something that could be
> > > > > > used to make mischief.
> > > > > >
> > > > > > AFAICT, kernfs_export_ops is mostly to support open_by_handle_at(). See
> > > > > > commit aa8188253474 ("kernfs: add exportfs operations").
> > > > > >
> > > > > > One idea: we could add a wrapper around generic_setlease() for
> > > > > > filesystems like this that will do a WARN_ONCE() and then call
> > > > > > generic_setlease(). That would keep leases working on them but we might
> > > > > > get some reports that would tell us who's setting leases on these files
> > > > > > and why.
> > > > >
> > > > > IMO, you are being too cautious, but whatever.
> > > > >
> > > > > It is not accurate that kernfs filesystems are NFS exportable in general.
> > > > > Only cgroupfs has KERNFS_ROOT_SUPPORT_EXPORTOP.
> > > > >
> > > > > If any application is using leases on cgroup files, it must be some
> > > > > very advanced runtime (i.e. systemd), so we should know about the
> > > > > regression sooner rather than later.
> > > > >
> > > >
> > > > I think so too. For now, I think I'll not bother with the WARN_ONCE().
> > > > Let's just leave kernfs out of the set until someone presents a real
> > > > use-case.
> > > >
> > > > > There are also the recently added nsfs and pidfs export_operations.
> > > > >
> > > > > I have a recollection about wanting to be explicit about not allowing
> > > > > those to be exportable to NFS (nsfs specifically), but I can't see where
> > > > > and if that restriction was done.
> > > > >
> > > > > Christian? Do you remember?
> > > > >
> > > >
> > > > (cc'ing Chuck)
> > > >
> > > > FWIW, you can currently export and mount /sys/fs/cgroup via NFS. The
> > > > directory doesn't show up when you try to get to it via NFSv4, but you
> > > > can mount it using v3 and READDIR works. The files are all empty when
> > > > you try to read them. I didn't try to do any writes.
> > > >
> > > > Should we add a mechanism to prevent exporting these sorts of
> > > > filesystems?
> > > >
> > > > Even better would be to make nfsd exporting explicitly opt-in. What if
> > > > we were to add an EXPORT_OP_NFSD flag that explicitly allows filesystems
> > > > to opt-in to NFS exporting, and check for that in __fh_verify()? We'd
> > > > have to add it to a bunch of existing filesystems, but that's fairly
> > > > simple to do with an LLM.
> > >
> > > What's the active harm in exporting /sys/fs/cgroup ? It has to be done
> > > explicitly via /etc/exports, so this is under the NFS server admin's
> > > control. Is it an attack surface?
> > >
> >
> > Potentially?
> >
> > I don't see any active harm with exporting cgroupfs. It doesn't work
> > right via nfsd, but it's not crashing the box or anything.
> >
> > At one time, those were only defined by filesystems that wanted to
> > allow NFS export. Now we've grown them on filesystems that just want to
> > provide filehandles for open_by_handle_at() and the like. nfsd doesn't
> > care though: if the fs has export operations, it'll happily use them.
> >
> > Having an explicit "I want to allow nfsd" flag seems like it might
> > save us some headaches in the future when other filesystems add export
> > ops for this sort of filehandle use.
>
> So we are re-hashing a discussion we had a few months ago (Amir was
> involved at least).
>
> I don't think we want to expose cgroupfs via NFS; that's super weird.
> It's like remote partial resource management and it would be very
> strange if a remote process suddenly would be able to move things around
> in the cgroup tree. So I would prefer to not do this.
>
> So my preference would be to really sever file handles from the export
> mechanism so that we can allow stuff like pidfs and nsfs and cgroupfs to
> use file handles via name_to_handle_at() and open_by_handle_at() without
> making them exportable.
>
> Somehow I thought that Amir had already done that work a while ago but
> maybe it was really just about name_to_handle_at() and not also
> open_by_handle_at()...

I don't recall doing anything except talking ;)

How about something like this to safeguard against exporting
the new pidfs/nsfs?

Regarding cgroupfs, we could either use an EXPORT_OP_ flag,
or maybe it should have a custom open/permission as well?

Thanks,
Amir.
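
For readers following the export discussion, here is a minimal userspace
sketch of the "explicit opt-in" idea floated in the quoted text: file handle
users only need export operations to exist, while the nfsd side also requires
a flag. This is not the attached patch and not kernel code. The
MODEL_EXPORT_OP_NFSD value, the model_* names, and the error codes are all
assumptions made up for illustration; only the EXPORT_OP_NFSD name comes from
the thread, where it is a proposal rather than an existing flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bit, modelled on the EXPORT_OP_NFSD idea in the thread. */
#define MODEL_EXPORT_OP_NFSD (1UL << 0)

/* Userspace stand-in for a filesystem's export operations. */
struct model_export_ops {
    unsigned long flags;
};

/* File handle users (name_to_handle_at()/open_by_handle_at() style) only
 * need export operations to be present. */
static int model_handle_allowed(const struct model_export_ops *ops)
{
    return ops ? 0 : -EOPNOTSUPP;
}

/* The nfsd side additionally requires the explicit opt-in flag.
 * The -EINVAL return value is an arbitrary choice for this model. */
static int model_nfsd_export_allowed(const struct model_export_ops *ops)
{
    if (!ops)
        return -EINVAL;
    if (!(ops->flags & MODEL_EXPORT_OP_NFSD))
        return -EINVAL;
    return 0;
}

/* Usage example: a disk-like fs opts in, a pidfs-like fs does not. */
static void export_demo(void)
{
    struct model_export_ops disk_like = { .flags = MODEL_EXPORT_OP_NFSD };
    struct model_export_ops pidfs_like = { .flags = 0 };

    assert(model_handle_allowed(&disk_like) == 0);
    assert(model_nfsd_export_allowed(&disk_like) == 0);

    /* Handles still work, but exporting through nfsd is refused. */
    assert(model_handle_allowed(&pidfs_like) == 0);
    assert(model_nfsd_export_allowed(&pidfs_like) == -EINVAL);
}
```

The point of the split is that a filesystem can keep file handle support
without becoming exportable as a side effect.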
View attachment "0001-nfsd-do-not-allow-exporting-of-special-kernel-filesy.patch" of type "text/x-patch" (2242 bytes)
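
The opt-in lease behaviour described in the quoted cover letter (an unset
->setlease() operation yields -EINVAL instead of falling back to the internal
implementation) can be sketched in userspace roughly as follows. This is a
model under stated assumptions, not the series itself: the model_* types and
functions are hypothetical stand-ins for struct file, struct file_operations,
generic_setlease() and kernel_setlease(), and the real signatures differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_file;

/* Stand-in for struct file_operations; NULL setlease means "not opted in". */
struct model_file_operations {
    int (*setlease)(struct model_file *f, int arg);
};

/* Stand-in for struct file; lease_type 0 means no lease is held. */
struct model_file {
    const struct model_file_operations *f_op;
    int lease_type;
};

/* Stand-in for generic_setlease(): just records the requested lease type. */
static int model_generic_setlease(struct model_file *f, int arg)
{
    f->lease_type = arg;
    return 0;
}

/* Stand-in for kernel_setlease() after the change: no hook, no lease. */
static int model_kernel_setlease(struct model_file *f, int arg)
{
    if (!f->f_op || !f->f_op->setlease)
        return -EINVAL;
    return f->f_op->setlease(f, arg);
}

/* A filesystem that opted in, and one that left the hook unset. */
static const struct model_file_operations optin_fops = {
    .setlease = model_generic_setlease,
};
static const struct model_file_operations unset_fops = {
    .setlease = NULL,
};

/* Usage example: only the opted-in filesystem grants the lease. */
static void lease_demo(void)
{
    struct model_file a = { .f_op = &optin_fops, .lease_type = 0 };
    struct model_file b = { .f_op = &unset_fops, .lease_type = 0 };

    assert(model_kernel_setlease(&a, 1) == 0);
    assert(a.lease_type == 1);

    assert(model_kernel_setlease(&b, 1) == -EINVAL);
    assert(b.lease_type == 0);
}
```

Under the old default described in the quoted text, the second case would
have fallen through to the internal implementation and succeeded, which is
how filesystems ended up allowing leases by accident.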