linux-ext4 - [PATCHSET 4/6] fuse2fs: use fuseblk mode

Message-ID: <174786678650.1385354.14994099236248944550.stgit@frogsfrogsfrogs>
Date: Wed, 21 May 2025 15:34:53 -0700
From: "Darrick J. Wong" <djwong@...nel.org>
To: tytso@....edu
Cc: linux-ext4@...r.kernel.org
Subject: [PATCHSET 4/6] fuse2fs: use fuseblk mode

Hi all,

While I was testing pre-iomap fuse2fs, I noticed some strange behavior.
When the filesystem is unmounted, the VFS mount goes away and umount(2)
returns before op_destroy is even called in fuse2fs.  As a
result, a subsequent fstest can try to format/mount the block device
even though fuse2fs hasn't even finished flushing dirty data to disk
or closed the block device.

This causes various weird test failures.  More alarmingly, this also
means that the age-old advice that it's safe to yank a USB stick after
unmount returns is not actually true for fuse2fs.  This can lead to user
data loss.

There is a solution to this -- fuseblk mode.  In this scheme, fuse2fs
tells the kernel which block device it wants, the kernel opens the block
device, and it upcalls FUSE_DESTROY before releasing the block device
or the in-kernel super_block.  This gives us the desired property that
when unmount completes, it's safe to remove the device.
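
As a rough illustration of the setup described above: libfuse selects
fuseblk mode through the "blkdev" mount option, with the block device
named via "fsname=".  The sketch below only builds that option string.
The helper name build_fuseblk_opts is hypothetical, and how fuse2fs
itself assembles its mount options is an assumption, not something this
cover letter shows.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build a libfuse mount option string that requests fuseblk mode.
 *
 * "fsname=" and "blkdev" are libfuse mount options; this helper and its
 * name are illustrative assumptions, not code from the patchset.
 * Device paths containing commas are not handled here.
 *
 * Returns 0 on success, -1 on bad arguments or if buf is too small.
 */
static int build_fuseblk_opts(const char *device, char *buf, size_t len)
{
	int n;

	if (!device || !buf || len == 0)
		return -1;

	n = snprintf(buf, len, "fsname=%s,blkdev", device);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output was truncated */

	return 0;
}
```

The resulting string would be handed to libfuse as a "-o" argument.
Because the kernel then owns the block device and upcalls FUSE_DESTROY
before releasing it, the flush and close work belongs in op_destroy.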

Unfortunately, this comes at a price.  Because the kernel insists upon
opening the fuseblk device in O_EXCL mode, we have to close the
filesystem before starting up fuse, and reopen it in op_init.  This
creates a largeish TOCTOU race window and increases mount times.  Worse
yet, if CONFIG_BLK_DEV_WRITE_MOUNTED=n, then this won't even work.

The last patch also registers fuse2fs as a process involved in memory
reclamation to prevent memory allocation deadlocks.
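
For readers unfamiliar with the idea, one existing Linux mechanism for
marking a userspace process as part of the I/O and reclaim path is
prctl(PR_SET_IO_FLUSHER), available since Linux 5.6 and requiring
CAP_SYS_RESOURCE.  The following is a minimal sketch of that call.
Whether the patch uses exactly this interface is an assumption on my
part, and the helper name mark_as_io_flusher is hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallback definitions for older userspace headers. */
#ifndef PR_SET_IO_FLUSHER
#define PR_SET_IO_FLUSHER 57
#endif
#ifndef PR_GET_IO_FLUSHER
#define PR_GET_IO_FLUSHER 58
#endif

/*
 * Ask the kernel to treat the calling process as an I/O flusher, so
 * that its own memory allocations avoid recursing into filesystem or
 * block I/O for reclaim.
 *
 * Returns 0 on success or a negative errno value, for example -EPERM
 * without CAP_SYS_RESOURCE or -EINVAL on kernels older than 5.6.
 */
static int mark_as_io_flusher(void)
{
	if (prctl(PR_SET_IO_FLUSHER, 1, 0, 0, 0) < 0)
		return -errno;
	return 0;
}
```

A failure here is probably best treated as non-fatal: the filesystem
still works, it just loses the extra protection against allocation
deadlocks.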

If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.

Comments and questions are, as always, welcome.

e2fsprogs git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/e2fsprogs.git/log/?h=fuse2fs-use-fuseblk
---
Commits in this patchset:
 * fuse2fs: rework FUSE2FS_CHECK_CONTEXT not to rely on global_fs
 * fuse2fs: get rid of the global_fs variable
 * fuse2fs: close filesystem from op_destroy
 * fuse2fs: split filesystem mounting into helper functions
 * fuse2fs: make norecovery behavior consistent with the kernel
 * fuse2fs: check for recorded fs errors before touching things
 * fuse2fs: recheck support after replaying journal
 * fuse2fs: improve error handling behaviors
 * libext2fs: make it possible to extract the fd from an IO manager
 * fuse2fs: use fuseblk mode for mounting filesystems
---
 lib/ext2fs/ext2_io.h         |    4 
 debian/libext2fs2t64.symbols |    1 
 lib/ext2fs/io_manager.c      |    8 +
 lib/ext2fs/unix_io.c         |   15 +
 misc/fuse2fs.c               |  491 +++++++++++++++++++++++++++++++-----------
 5 files changed, 387 insertions(+), 132 deletions(-)