Message-ID: <20250819223210.GG7942@frogsfrogsfrogs>
Date: Tue, 19 Aug 2025 15:32:10 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: John Groves <John@...ves.net>
Cc: Miklos Szeredi <miklos@...redi.hu>,
Dan Williams <dan.j.williams@...el.com>,
Bernd Schubert <bschubert@....com>,
John Groves <jgroves@...ron.com>, Jonathan Corbet <corbet@....net>,
Vishal Verma <vishal.l.verma@...el.com>,
Dave Jiang <dave.jiang@...el.com>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>,
Randy Dunlap <rdunlap@...radead.org>,
Jeff Layton <jlayton@...nel.org>,
Kent Overstreet <kent.overstreet@...ux.dev>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
nvdimm@...ts.linux.dev, linux-cxl@...r.kernel.org,
linux-fsdevel@...r.kernel.org, Amir Goldstein <amir73il@...il.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Stefan Hajnoczi <shajnocz@...hat.com>,
Joanne Koong <joannelkoong@...il.com>,
Josef Bacik <josef@...icpanda.com>,
Aravind Ramesh <arramesh@...ron.com>,
Ajay Joshi <ajayjoshi@...ron.com>
Subject: Re: [RFC V2 14/18] famfs_fuse: GET_DAXDEV message and daxdev_table

On Sat, Aug 16, 2025 at 11:22:49AM -0500, John Groves wrote:
> On 25/08/14 08:25PM, Miklos Szeredi wrote:
> > On Thu, 14 Aug 2025 at 19:19, Darrick J. Wong <djwong@...nel.org> wrote:
> > > What happens if you want to have a fuse server that hosts both famfs
> > > files /and/ backing files? That'd be pretty crazy to mix both paths in
> > > one filesystem, but it's in theory possible, particularly if the famfs
> > > server wanted to export a pseudofile where everyone could find that
> > > shadow file?
> >
> > Either FUSE_DEV_IOC_BACKING_OPEN detects what kind of object it has
> > been handed, or we add a flag that explicitly says this is a dax dev
> > or a block dev or a regular file. I'd prefer the latter.
> >
> > Thanks,
> > Miklos
>
> I have future ideas of famfs supporting non-dax-memory files in a mixed
> namespace with normal famfs dax files. This seems like the simplest way
> to relax the "files are strictly pre-allocated" rule. But I think this
> is orthogonal to how fmaps and backing devs are passed into the kernel.
>
> The way I'm thinking about it, the difference would be handled in
> read/write/mmap. Taking fuse_file_read_iter as the example, the code
> currently looks like this:
>
> 	if (FUSE_IS_VIRTIO_DAX(fi))
> 		return fuse_dax_read_iter(iocb, to);
> 	if (fuse_file_famfs(fi))
> 		return famfs_fuse_read_iter(iocb, to);
>
> 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
> 	if (ff->open_flags & FOPEN_DIRECT_IO)
> 		return fuse_direct_read_iter(iocb, to);
> 	else if (fuse_file_passthrough(ff))
> 		return fuse_passthrough_read_iter(iocb, to);
> 	else
> 		return fuse_cache_read_iter(iocb, to);
>
> If the famfs fuse server wants a particular file handled via another
> mechanism -- e.g. READ message to server or passthrough -- the famfs
> fuse server can just provide an fmap that indicates such. Then
> fuse_file_famfs(fi) would return false for that file, and it would be
> handled through other existing mechanisms (which the famfs fuse
> server would have to handle correctly).
>
> Famfs could, for example, allow files to be created as generic or
> passthrough, and then have a "commit" step that allocated dax memory,
> moved the data from non-dax into dax, and appended the file to the
> famfs metadata log - flipping the file to full-monty-famfs (tm).
> Prior to the "commit", performance is lower but all manner of mutations
> could be allowed.
>
> So I don't think this looks very hard, and it's independent of the
> mechanism by which fmaps get into the kernel.

This is one thing I wasn't planning -- iomap files are always that, and
there's no fallback to any of the other IO strategies. The pagecache
handling parts of iomap require things such as i_rwsem controlling
access to a file no matter how many places it's hardlinked, and
timestamp/mode/acl handling working more or less the same way they do in
xfs and ext4. iomap isn't all that congruent with the way that the
other IO paths (passthrough, writeback_cache, and "directio" files)
work.

Though to undercut my own point partially, sending an "inline data"
mapping to the kernel causes it to call FUSE_READ/FUSE_WRITE and then
you can inject whatever IO path you want. OTOH the iomap inlinedata
paths are ... not well tested for pos > 0.
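
To make the inline-data idea above a bit more concrete, here is a minimal
userspace sketch (not kernel code, and not taken from any posted patch).
It assumes a per-range mapping type where an "inline" mapping means the
kernel asks the fuse server for the data with a FUSE_READ/FUSE_WRITE-style
request, while any other mapping is serviced directly. Every identifier
prefixed with sketch_ is hypothetical and exists only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping types, loosely modelled on the idea of "mapped"
 * versus "inline data" ranges. These are NOT the kernel's definitions. */
enum sketch_map_type {
	SKETCH_MAP_MAPPED,	/* range is backed directly by a device range */
	SKETCH_MAP_INLINE,	/* data has to come from the fuse server */
};

/* Hypothetical I/O paths the file could take for a given range. */
enum sketch_io_path {
	SKETCH_IO_DIRECT_MAPPING,	/* I/O goes straight to the mapping */
	SKETCH_IO_SERVER_REQUEST,	/* I/O becomes a request to the server */
};

struct sketch_mapping {
	enum sketch_map_type type;
	unsigned long long offset;	/* start of the range, in bytes */
	unsigned long long length;	/* length of the range, in bytes */
};

/* Return the mapping that covers pos, or NULL if none does. */
static const struct sketch_mapping *
sketch_lookup(const struct sketch_mapping *maps, size_t nr,
	      unsigned long long pos)
{
	for (size_t i = 0; i < nr; i++) {
		if (pos >= maps[i].offset &&
		    pos - maps[i].offset < maps[i].length)
			return &maps[i];
	}
	return NULL;
}

/* The file never leaves the mapping-driven path; only the mapping type
 * decides whether a range is serviced directly or by the server. */
static enum sketch_io_path
sketch_pick_path(const struct sketch_mapping *m)
{
	if (m->type == SKETCH_MAP_INLINE)
		return SKETCH_IO_SERVER_REQUEST;
	return SKETCH_IO_DIRECT_MAPPING;
}
```

The point of the sketch is only that the choice is made per mapping rather
than per file, so a server could route some ranges to itself without the
file falling back to a different IO strategy. Whether that is practical
depends on the pos > 0 caveat above.
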
--D

> Regards,
> John
>
>
>