lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260206042521.GJ7686@frogsfrogsfrogs>
Date: Thu, 5 Feb 2026 20:25:21 -0800
From: "Darrick J. Wong" <djwong@...nel.org>
To: Chris Mason <clm@...a.com>
Cc: miklos@...redi.hu, joannelkoong@...il.com, bernd@...ernd.com,
	neal@...pa.dev, linux-ext4@...r.kernel.org,
	linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 29/31] fuse: disable direct reclaim for any fuse server
 that uses iomap

On Thu, Feb 05, 2026 at 10:57:15AM -0800, Chris Mason wrote:
> "Darrick J. Wong" <djwong@...nel.org> wrote:
> > From: Darrick J. Wong <djwong@...nel.org>
> > 
> > Any fuse server that uses iomap can create a substantial amount of dirty
> > pages in the pagecache because we don't write dirty stuff until reclaim
> > or fsync.  Therefore, memory reclaim on any fuse iomap server musn't
> > ever recurse back into the same filesystem.  We must also never throttle
> > the fuse server writes to a bdi because that will just slow down
> > metadata operations.
> 
> Hi everyone,
> 
> I'm trying out my AI review prompts on a few more trees, and I ran it
> on the fuse-iomap-cache branch:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-cache
> 
> This was the latest email lore could find, apologies if I grabbed the wrong one.
> 
> Hopefully the review below is useful, please let me know if there are places it
> has gone wildly wrong.  I kept the header to make it obvious exactly which
> commit I'd run through:
> 
> commit 3bdd586d6fa00e067dbc3676642ac75e54ec3031
> Author: Darrick J. Wong <djwong@...nel.org>
> 
> fuse: disable direct reclaim for any fuse server that uses iomap
> 
> Any fuse server that uses iomap can create a substantial amount of dirty
> pages in the pagecache because we don't write dirty stuff until reclaim
> or fsync. Therefore, memory reclaim on any fuse iomap server must avoid
> recursing back into the same filesystem and throttling the fuse server.
> 
> > diff --git a/fs/fuse/fuse_iomap.c b/fs/fuse/fuse_iomap.c
> > --- a/fs/fuse/fuse_iomap.c
> > +++ b/fs/fuse/fuse_iomap.c
> > @@ -1019,6 +1019,12 @@ static void fuse_iomap_config_reply(struct fuse_mount *fm,
> >  	 */
> >  	fc->sync_fs = true;
> >
> > +	/*
> > +	 * We could be on the hook for a substantial amount of writeback, so
> > +	 * prohibit reclaim from recursing into fuse or the kernel from
> > +	 * throttling any bdis that the fuse server might write to.
> > +	 */
> > +	current->flags |= PF_MEMALLOC_NOFS | PF_LOCAL_THROTTLE;
> 
> Are these process flags being set on the correct process? Looking at the
> call chain:
> 
> Sync path:
>   fuse_fill_super() -> fuse_iomap_mount() -> fuse_iomap_config_reply()
> 
> Async path:
>   fuse_request_end() -> args->end() -> fuse_iomap_config_reply()
> 
> In the sync path, current points to the mount(2) process which exits
> after mounting completes. In the async path, current points to whichever
> process writes the reply to /dev/fuse (the fuse daemon thread) or a
> kernel worker thread for virtio-fs.
> 
> The commit message states the goal is to prohibit reclaim from recursing
> into the fuse server, but the fuse server is a separate userspace process.
> Setting flags on the mount process (which exits) or on a random fuse
> daemon thread (only once during init) doesn't protect the fuse server
> threads that actually perform I/O operations.
> 
> Should the flags be set on the correct process context? Or if they need
> to be set here, should they follow the pattern used in fs/nfsd/vfs.c and
> fs/nfs/localio.c which save and restore the flags?
> 
> Looking at fs/nfsd/vfs.c:1438-1509:
> 
>   unsigned long old_flags = current->flags;
>   current->flags |= PF_LOCAL_THROTTLE;
>   ...
>   current_restore_flags(pflags, PF_LOCAL_THROTTLE);
> 
> and fs/nfs/localio.c:824-828:
> 
>   unsigned long old_flags = current->flags;
>   current->flags |= PF_LOCAL_THROTTLE;
>   ...
>   current_restore_flags(old_flags, PF_LOCAL_THROTTLE);
> 
> Without saving and restoring, the flags permanently alter the process
> state. For the mount process this is wasted when it exits, and for fuse
> daemon threads this causes permanent behavior changes.

Hrmm.  Normally the fuse server itself calls mount(2) via libfuse to set
up the mount point, so this does (AFAICT) set PF_ flags on the fuse
server itself.

However, any fuse setup that doesn't handle things this way would indeed
set the PF_ flags on the wrong process.  There are (a) other fuse
libraries out there, and (b) the fuse service architecture mentioned
downthread has a mount helper that starts up a fuse server on the other
end of a unix socket, passes it resources including /dev/fuse to start
up, and then calls mount(2) itself instead of the fuse server doing
that.

That part's broken, and I'll have to think about how to solve that.
Or maybe someone else will tell me this is all undesirable and I'll just
drop this patch. :)

--D

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ