Message-ID: <175573712721.20753.5223489399594191991.stgit@frogsfrogsfrogs>
Date: Wed, 20 Aug 2025 17:49:22 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: tytso@....edu
Cc: amir73il@...il.com, John@...ves.net, bernd@...ernd.com,
linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org, miklos@...redi.hu,
amir73il@...il.com, joannelkoong@...il.com, neal@...pa.dev
Subject: [PATCHSET RFC v4 1/6] fuse4fs: fork a low level fuse server
Hi all,
Whilst developing the fuse2fs+iomap prototype, I discovered a
fundamental design limitation of the upper-level libfuse API: hardlinks.
The upper level fuse library really wants to communicate with the fuse
server using file paths instead of inode numbers. This works
great for filesystems that don't have inodes, create files dynamically
at runtime, or lack stable inode numbers.
Unfortunately, the libfuse path abstraction assigns a unique nodeid to
every child file in the entire filesystem, without regard to hard links.
In other words, a hardlinked regular file may have one ondisk inode
number but multiple kernel inodes. For classic fuse2fs this isn't a
problem because all file access goes through the fuse server and the big
library lock protects us from corruption.
For fuse2fs + iomap this is a disaster because we rely on the kernel to
coordinate access to inodes. For hardlinked files, we *require* that
there only be one in-kernel inode for each ondisk inode.
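
To make the aliasing concrete, here is a toy model in plain C (no libfuse involved). Every name in it (toy_table, nodeid_by_path, nodeid_by_ino, the sample paths and inode numbers) is made up for illustration and is not taken from libfuse or fuse2fs. It only shows that keying nodeids by path gives a hardlinked file two identities, while keying by ondisk inode number gives it one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_NODES		16
#define TOY_FIRST_NODEID	2	/* nodeid 1 is reserved for the FUSE root */

/* Toy directory entry: a path and the ondisk inode it names. */
struct toy_dirent {
	const char *path;
	uint32_t ondisk_ino;
};

/* Hypothetical sample namespace: /a and /dir/b are hard links. */
static const struct toy_dirent toy_links[] = {
	{ "/a",     12 },
	{ "/dir/b", 12 },
	{ "/c",     13 },
};
#define TOY_NR_LINKS	(sizeof(toy_links) / sizeof(toy_links[0]))

/* Toy nodeid table; slot i holds the key for nodeid TOY_FIRST_NODEID + i. */
struct toy_table {
	const char *paths[TOY_MAX_NODES];
	uint32_t inos[TOY_MAX_NODES];
	unsigned int count;
};

/* Path-keyed scheme: every distinct path gets its own nodeid. */
static uint64_t nodeid_by_path(struct toy_table *t, const char *path)
{
	unsigned int i;

	for (i = 0; i < t->count; i++)
		if (strcmp(t->paths[i], path) == 0)
			return TOY_FIRST_NODEID + i;
	assert(t->count < TOY_MAX_NODES);
	t->paths[t->count] = path;
	return TOY_FIRST_NODEID + t->count++;
}

/* Inode-keyed scheme: every distinct ondisk inode gets one nodeid. */
static uint64_t nodeid_by_ino(struct toy_table *t, uint32_t ino)
{
	unsigned int i;

	for (i = 0; i < t->count; i++)
		if (t->inos[i] == ino)
			return TOY_FIRST_NODEID + i;
	assert(t->count < TOY_MAX_NODES);
	t->inos[t->count] = ino;
	return TOY_FIRST_NODEID + t->count++;
}

/* Number of distinct kernel inodes the sample namespace would produce. */
static unsigned int kernel_inodes_by_path(void)
{
	struct toy_table t = { .count = 0 };
	size_t i;

	for (i = 0; i < TOY_NR_LINKS; i++)
		nodeid_by_path(&t, toy_links[i].path);
	return t.count;
}

static unsigned int kernel_inodes_by_ino(void)
{
	struct toy_table t = { .count = 0 };
	size_t i;

	for (i = 0; i < TOY_NR_LINKS; i++)
		nodeid_by_ino(&t, toy_links[i].ondisk_ino);
	return t.count;
}
```

In this model the two ondisk inodes end up as three kernel inodes under the path-keyed scheme and two under the inode-keyed one.
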
The path based mechanism is also very inefficient for fuse2fs. Every
time a file is accessed, the upper level libfuse passes a new nodeid to
the kernel, and on every file access the kernel passes that same nodeid
back to libfuse. libfuse then walks its internal directory entry cache
to construct a path string for that nodeid and hands it to fuse2fs.
fuse2fs then walks the ondisk directory structure to find the ext2 inode
number. Every time.
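
A rough way to picture the per-access cost is the toy model below. It is an illustration only: toy_cache, build_path, count_dir_lookups and path_walk_cost are invented names, and the real libfuse and fuse2fs data structures are more involved. The first step stands in for libfuse rebuilding a path from cached parent links, the second for the server resolving that path one directory at a time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy cached name: a path component and the index of its parent. */
struct toy_node {
	const char *name;	/* component name; "" for the root */
	int parent;		/* index of the parent node, -1 for the root */
};

/* Hypothetical cache holding /, /usr, /usr/lib and /usr/lib/libc.so. */
static const struct toy_node toy_cache[] = {
	{ "", -1 }, { "usr", 0 }, { "lib", 1 }, { "libc.so", 2 },
};

/*
 * Step 1: rebuild the path by following parent links up to the root.
 * Returns the number of parent hops taken.
 */
static int build_path(int idx, char *buf, size_t len)
{
	int hops;
	size_t used;

	if (toy_cache[idx].parent < 0) {
		snprintf(buf, len, "/");
		return 0;
	}
	hops = build_path(toy_cache[idx].parent, buf, len);
	used = strlen(buf);
	snprintf(buf + used, len - used, "%s%s", hops ? "/" : "",
		 toy_cache[idx].name);
	return hops + 1;
}

/*
 * Step 2: resolve the path again on the server side.  Each component
 * costs one directory lookup; only the count is modelled here.
 */
static int count_dir_lookups(const char *path)
{
	int lookups = 0;
	const char *p = path;

	while (*p) {
		while (*p == '/')
			p++;
		if (!*p)
			break;
		lookups++;
		while (*p && *p != '/')
			p++;
	}
	return lookups;
}

/* Total work for one access: parent hops plus directory lookups. */
static int path_walk_cost(int idx)
{
	char buf[256];
	int hops = build_path(idx, buf, sizeof(buf));

	return hops + count_dir_lookups(buf);
}
```

For the file three levels deep, the model counts three parent hops and three directory lookups per access. If the nodeid already identified the inode, both steps would be unnecessary.
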
Create a new fuse4fs server from fuse2fs that uses the lowlevel fuse
API. This affords us direct control over nodeids and eliminates the
path wrangling. Hardlinks can be supported when iomap is turned on,
and metadata-heavy workloads run twice as fast.
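
As a sketch of what "direct control over nodeids" could look like, the toy code below derives the nodeid from the ext2 inode number. This is an assumption for illustration and not necessarily how fuse4fs does it. The only facts relied on are that FUSE reserves nodeid 1 for the root and that the ext2 root directory is inode 2; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for FUSE_ROOT_ID (libfuse) and EXT2_ROOT_INO (ext2fs). */
#define TOY_FUSE_ROOT_ID	1
#define TOY_EXT2_ROOT_INO	2

/*
 * Hypothetical mapping: the ext2 root becomes the FUSE root nodeid and
 * every other inode number is used as the nodeid unchanged.  This is
 * only safe if ext2 inode 1 is never exposed through the namespace,
 * which the sketch assumes.
 */
static uint64_t toy_ino_to_nodeid(uint32_t ino)
{
	if (ino == TOY_EXT2_ROOT_INO)
		return TOY_FUSE_ROOT_ID;
	return ino;
}

/* Inverse mapping, used when the kernel hands a nodeid back. */
static uint32_t toy_nodeid_to_ino(uint64_t nodeid)
{
	if (nodeid == TOY_FUSE_ROOT_ID)
		return TOY_EXT2_ROOT_INO;
	return (uint32_t)nodeid;
}
```

With a mapping of this kind, two directory entries that name the same ondisk inode resolve to the same nodeid, and the server can go from nodeid to inode number without building or parsing a path.
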
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
Comments and questions are, as always, welcome.
e2fsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/e2fsprogs.git/log/?h=fuse4fs-fork
---
Commits in this patchset:
* fuse2fs: port fuse2fs to lowlevel libfuse API
* fuse4fs: drop fuse 2.x support code
* fuse4fs: namespace some helpers
* fuse4fs: convert to low level API
* libsupport: port the kernel list.h to libsupport
* libsupport: add a cache
* cache: disable debugging
* cache: use modern list iterator macros
* cache: embed struct cache in the owner
* cache: pass cache pointer to callbacks
* cache: pass a private data pointer through cache_walk
* cache: add a helper to grab a new refcount for a cache_node
* cache: return results of a cache flush
* cache: add a "get only if incore" flag to cache_node_get
* cache: support gradual expansion
* cache: implement automatic shrinking
* fuse4fs: add cache to track open files
* fuse4fs: use the orphaned inode list
* fuse4fs: implement FUSE_TMPFILE
* fuse4fs: create incore reverse orphan list
---
 lib/ext2fs/jfs_compat.h  |    2
 lib/ext2fs/kernel-list.h |  111 -
 lib/support/cache.h      |  177 +
 lib/support/list.h       |  901 +++++++
 lib/support/xbitops.h    |  128 +
 configure                |   50
 configure.ac             |   31
 debugfs/Makefile.in      |   12
 e2fsck/Makefile.in       |   56
 lib/config.h.in          |    3
 lib/e2p/Makefile.in      |    4
 lib/ext2fs/Makefile.in   |   14
 lib/support/Makefile.in  |    8
 lib/support/cache.c      |  853 ++++++
 misc/Makefile.in         |   35
 misc/fuse4fs.c           | 6098 ++++++++++++++++++++++++++++++++++++++++++++++
 misc/tune2fs.c           |    4
 17 files changed, 8319 insertions(+), 168 deletions(-)
 delete mode 100644 lib/ext2fs/kernel-list.h
 create mode 100644 lib/support/cache.h
 create mode 100644 lib/support/list.h
 create mode 100644 lib/support/xbitops.h
 create mode 100644 lib/support/cache.c
 create mode 100644 misc/fuse4fs.c